METHODS OF REFOLDING MAMMALIAN
GLYCOSYLTRANSFERASES 



FIELD OF INVENTION 
[0001] The present invention provides methods of refolding mammalian
glycosyltransferases that have been produced in bacterial cells, including glycosyltransferase
mutants that have enhanced ability to be refolded, and methods to use such refolded
glycosyltransferases. The invention also provides methods of refolding more than one
glycosyltransferase in a single vessel, methods to use such refolded glycosyltransferases, and
reaction mixtures comprising the refolded glycosyltransferases.

BACKGROUND OF THE INVENTION 
[0002] Eukaryotic organisms synthesize oligosaccharide structures or glycoconjugates,
such as glycolipids or glycoproteins, that are commercially and therapeutically useful. In
vitro synthesis of oligosaccharides or glycoconjugates can be carried out using recombinant
eukaryotic glycosyltransferases. The most efficient method to produce recombinant
eukaryotic glycosyltransferases for oligosaccharide synthesis is to express the protein in
bacteria. However, in bacteria, many eukaryotic glycosyltransferases are expressed as
insoluble proteins in bacterial inclusion bodies, and yields of active protein from the inclusion
bodies can be very low. Thus, there is a need for improved methods to produce eukaryotic
glycosyltransferases in bacteria. The present invention solves this and other needs.

BRIEF SUMMARY OF THE INVENTION 
[0003] The present invention provides improved methods to refold insoluble eukaryotic 
glycosyltransferases in an active form and also provides glycosyltransferases, e.g., N- 
acetylglucosaminyltransferase I (GnTI) enzymes that have enhanced refolding properties. 

[0004] In one aspect, the invention provides a recombinant eukaryotic
N-acetylglucosaminyltransferase I (GnTI) enzyme that has been mutated to replace an unpaired
cysteine residue with an amino acid that enhances refolding of the enzyme from an insoluble
precipitate, e.g., bacterial inclusion bodies. The GnTI enzyme includes at least the catalytic
domain of the GnTI enzyme. The GnTI enzyme is biologically active, i.e., able to catalyze
the transfer of a donor substrate to an acceptor substrate.



[0005] In one embodiment, the GnTI enzyme is a human protein. Some mutations of the
CYS121 residue in human GnTI enhance refolding. Those mutants include, e.g., a
CYS121SER mutation, a CYS121ALA mutation, a CYS121ASP mutation, and a double
mutant, ARG120ALA, CYS121HIS. Representative sequences of GnTI mutants are shown
in Figures 7-11. In other eukaryotes, e.g., similar mutations of an unpaired cysteine residue,
CYS123, enhance refolding of the GnTI enzyme.

[0006] In another embodiment, the GnTI enzyme also includes an amino acid tag, e.g., a
maltose binding protein (MBP), a polyhistidine tag, a glutathione S transferase (GST), a
starch binding protein (SBP), or a myc epitope.

[0007] In another aspect, the invention provides nucleic acids encoding a recombinant
eukaryotic GnTI enzyme that has been mutated to replace an unpaired cysteine residue with
an amino acid that enhances refolding of the enzyme from an insoluble precipitate, e.g.,
bacterial inclusion bodies. As above, the encoded GnTI enzyme includes at least the
catalytic domain of the GnTI enzyme, and is biologically active, i.e., able to catalyze the
transfer of a donor substrate to an acceptor substrate.

[0008] In one embodiment, the nucleic acids encode a human GnTI enzyme. Some
mutations of the CYS121 residue in human GnTI enhance refolding. Those mutants include,
e.g., a CYS121SER mutation, a CYS121ALA mutation, a CYS121ASP mutation, and a double
mutant, ARG120ALA, CYS121HIS. Representative nucleic acid sequences encoding GnTI
mutant proteins are shown in Figures 7-11. In other eukaryotes, e.g.,
similar mutations of an unpaired cysteine residue, CYS123, enhance refolding of the GnTI
enzyme.

[0009] In a further embodiment, the encoded GnTI enzyme also includes an amino acid tag,
e.g., a maltose binding protein (MBP), a polyhistidine tag, a glutathione S transferase (GST),
a starch binding protein (SBP), or a myc epitope.

[0010] The invention also includes expression vectors that include the mutated GnTI 
nucleic acids, host cells that include the GnTI expression vectors, and methods of producing 
the mutated GnTI enzymes using the host/expression vector system. 

[0011] In another embodiment, the invention provides a method of adding
N-acetylglucosamine residues to an acceptor molecule with a terminal mannose residue, by
contacting the acceptor molecule with an activated N-acetylglucosamine molecule and a
eukaryotic GnTI enzyme that has been mutated to enhance refolding. The acceptor molecule
can be, e.g., a polysaccharide, an oligosaccharide, a glycolipid, or a glycoprotein.

[0012] In another aspect, the invention provides a method of refolding at least two
insoluble, recombinant eukaryotic glycosyltransferase proteins in a single vessel, by
contacting the glycosyltransferases with a refolding buffer that includes a redox couple.
After refolding, at least two of the refolded glycosyltransferases have biological activity, e.g.,
are able to catalyze the transfer of a donor substrate to an acceptor substrate.

[0013] The refolding buffer can also include a detergent, or a chaotropic agent, or arginine,
or PEG. In some embodiments the pH of the refolding buffer is between 6.0 and 10.0. In
one embodiment, the pH of the refolding buffer is between 6.5 and 8.0. In another
embodiment, the pH of the refolding buffer is between 8.0 and 9.0.

[0014] In another embodiment, the glycosyltransferases include an amino acid tag, e.g., a
maltose binding protein (MBP), a polyhistidine tag, a glutathione S transferase (GST), a
starch binding protein (SBP), or a myc epitope.

[0015] In one embodiment, more than one glycosyltransferase from an N-linked glycan
biosynthetic pathway are refolded together.

[0016] In one embodiment, a sialyltransferase is refolded with another glycosyltransferase
using the methods of the invention.

[0017] In one embodiment, an N-acetylglucosaminyltransferase is refolded with another
glycosyltransferase using the methods of the invention.

[0018] In one embodiment, a galactosyltransferase is refolded with another
glycosyltransferase using the methods of the invention.

[0019] In another embodiment, a sialyltransferase, an N-acetylglucosaminyltransferase, and
a galactosyltransferase are refolded together in a single vessel using the methods of the
invention.

[0020] In one embodiment, more than one glycosyltransferase from an O-linked glycan
biosynthetic pathway are refolded together. In a further embodiment, a first enzyme is an
N-acetylgalactosaminyltransferase. In a preferred embodiment, a first enzyme is an
N-acetylgalactosaminyltransferase 2 (GalNAcT2).

[0021] The present invention also provides a reaction mixture including a recombinant
eukaryotic GnTI enzyme that has been mutated to replace an unpaired cysteine residue with
an amino acid that enhances refolding of the enzyme from an insoluble precipitate, e.g.,
bacterial inclusion bodies, and at least one other glycosyltransferase, where the enzymes have
been refolded in the same vessel. The second glycosyltransferase can be, e.g., a
sialyltransferase or a galactosyltransferase. In one embodiment, the reaction mixture includes
the mutated eukaryotic GnTI enzyme, a sialyltransferase, and a galactosyltransferase. The
reaction mixtures can be used with an acceptor molecule and a donor sugar, to produce, e.g., a
polysaccharide, an oligosaccharide, a glycolipid, or a glycoprotein.

[0022] In another aspect, the invention provides a method of refolding an insoluble
recombinant eukaryotic sialyltransferase, by (a) solubilizing the sialyltransferase; and then
(b) contacting the soluble sialyltransferase with a refolding buffer including a redox couple.
The refolded sialyltransferase is biologically active and catalyzes the transfer of sialic acid
from a donor substrate to an acceptor substrate. In one embodiment, the refolded
sialyltransferase is dialyzed or diafiltered.

[0023] The refolding buffer can also include a detergent, or a chaotropic agent, or arginine. 
In some embodiments the pH of the refolding buffer is between 6.0 and 10.0. In one 
embodiment, the pH of the refolding buffer is between 6.5 and 8.0. In another embodiment, 
the pH of the refolding buffer is between 8.0 and 9.0. 

[0024] In one embodiment, the redox couple in the refolding buffer is reduced
glutathione/oxidized glutathione (GSH/GSSG). In a further embodiment, the molar ratio of
GSH/GSSG is between 100:1 and 1:10. In a preferred embodiment, the molar ratio of
GSH/GSSG is 10:1. In a still further embodiment, the refolding buffer comprises about
0.02-10 mM GSH, 0.005-10 mM GSSG, 0.005-10 mM lauryl maltoside, 50-250 mM NaCl,
2-10 mM KCl, 0.01-0.05% PEG 3350, and 150-550 mM L-arginine.
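
The concentration ranges and GSH/GSSG ratio limits recited above can be checked for a candidate buffer recipe with a short script. The following is a minimal sketch only: the ranges are copied from the preceding paragraph, while the function names, the dictionary layout, and the example recipe values are hypothetical and are not taken from the disclosure.

```python
# Illustrative sketch; ranges come from the paragraph above, everything else is assumed.
REFOLDING_RANGES_MM = {
    "GSH": (0.02, 10.0),
    "GSSG": (0.005, 10.0),
    "lauryl_maltoside": (0.005, 10.0),
    "NaCl": (50.0, 250.0),
    "KCl": (2.0, 10.0),
    "L_arginine": (150.0, 550.0),
}
PEG3350_PERCENT_RANGE = (0.01, 0.05)
GSH_GSSG_RATIO_RANGE = (0.1, 100.0)  # molar ratio from 1:10 up to 100:1


def gsh_gssg_ratio(gsh_mm, gssg_mm):
    """Return the GSH:GSSG molar ratio as a single number (10.0 means 10:1)."""
    if gssg_mm <= 0:
        raise ValueError("GSSG concentration must be positive")
    return gsh_mm / gssg_mm


def out_of_range(recipe_mm, peg3350_percent):
    """List the components of a recipe that fall outside the recited ranges."""
    problems = []
    for name, (low, high) in REFOLDING_RANGES_MM.items():
        if not low <= recipe_mm[name] <= high:
            problems.append(name)
    low, high = PEG3350_PERCENT_RANGE
    if not low <= peg3350_percent <= high:
        problems.append("PEG3350")
    ratio = gsh_gssg_ratio(recipe_mm["GSH"], recipe_mm["GSSG"])
    if not GSH_GSSG_RATIO_RANGE[0] <= ratio <= GSH_GSSG_RATIO_RANGE[1]:
        problems.append("GSH:GSSG ratio")
    return problems


# Hypothetical recipe chosen to give the preferred 10:1 ratio.
EXAMPLE_RECIPE_MM = {
    "GSH": 1.0,
    "GSSG": 0.1,
    "lauryl_maltoside": 0.1,
    "NaCl": 100.0,
    "KCl": 5.0,
    "L_arginine": 400.0,
}

if __name__ == "__main__":
    print(gsh_gssg_ratio(EXAMPLE_RECIPE_MM["GSH"], EXAMPLE_RECIPE_MM["GSSG"]))
    print(out_of_range(EXAMPLE_RECIPE_MM, peg3350_percent=0.03))
```

Because the text recites "about" for these ranges, a hard boundary check of this kind is stricter than the disclosure itself and is intended only as a quick sanity check.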

[0025] In another embodiment, the sialyltransferase includes an amino acid tag, e.g., a
maltose binding protein (MBP), a polyhistidine tag, a glutathione S transferase (GST), a
starch binding protein (SBP), or a myc epitope. In a further embodiment, the
sialyltransferase is purified using a tag binding molecule that binds to the amino acid tag. For
example, the amino acid tag can be MBP and the tag binding molecule can be amylose,
maltose, or a cyclodextrin.

[0026] In another embodiment, the refolded sialyltransferase catalyzes the transfer of sialic
acid from CMP-sialic acid to a glycoprotein.

[0027] In a further embodiment, the refolded sialyltransferase catalyzes the transfer of
10K PEG or 20K PEG from CMP-SA-PEG(10 kDa) or CMP-SA-PEG(20 kDa) to a
glycoprotein.

[0028] In another embodiment, the sialyltransferase is rat liver ST3GalIII. 

[0029] In another aspect, the invention provides a method of adding a sialyl moiety to a
glycoprotein, by contacting the glycoprotein with CMP-sialic acid and a refolded
mammalian sialyltransferase that was refolded using the methods disclosed herein.

[0030] In another aspect, the invention provides a method of adding a PEG moiety to a
glycoprotein, by contacting the glycoprotein with CMP-SA-PEG(10
kDa) or CMP-SA-PEG(20 kDa) and a refolded mammalian sialyltransferase that was
refolded using the methods disclosed herein.

[0031] In a further aspect the invention provides a method of refolding an insoluble
recombinant eukaryotic N-acetylgalactosaminyltransferase 2 (GalNAcT2) by solubilizing the
GalNAcT2 in a solubilization buffer; and then contacting the soluble GalNAcT2 with a
refolding buffer that includes a redox couple to refold the GalNAcT2. After refolding, the
refolded GalNAcT2 catalyzes the transfer of N-acetylgalactosamine from a donor substrate to
an acceptor substrate. The method can optionally include steps of dialyzing or diafiltering
the refolded GalNAcT2 or further purification of the refolded GalNAcT2.

[0032] In some embodiments the redox couple of the refolding buffer is reduced
glutathione/oxidized glutathione (GSH/GSSG) or cysteine/cystamine. The refolding buffer
can also include the following: a detergent, a chaotropic agent, or arginine. In some
embodiments, the pH of the refolding buffer is between 6.0 and 10.0. In one preferred
embodiment, the pH of the refolding buffer is about 8.0.

[0033] In preferred embodiments, the solubilization buffer pH is between 6.0 and 10.0. In 
a more preferred embodiment, the solubilization buffer pH is about 8.0. 

[0034] The recombinantly expressed GalNAcT2 can include an amino acid tag. The
amino acid tag can be, e.g., a maltose binding protein (MBP), a polyhistidine tag, a
glutathione S transferase (GST), a starch binding protein (SBP), or a myc epitope. A tag
binding molecule can be used to purify the refolded GalNAcT2. When the amino acid tag is
MBP, the tag binding molecule is generally one of the following: amylose, maltose, or a
cyclodextrin.

[0035] In a preferred embodiment, the refolded GalNAcT2 catalyzes the transfer of
N-acetylgalactosamine from a donor substrate to a peptide, protein, glycopeptide or
glycoprotein.

BRIEF DESCRIPTION OF THE DRAWINGS 
[0036] Figure 1 provides the buffer conditions tested in refolding MBP-ST3GalIII from 
bacterial inclusion bodies. The activity of the refolded enzymes is also provided. 

[0037] Figure 2 provides an elution profile of refolded and dialyzed MBP-ST3GalIII from
an amylose column.

[0038] Figure 3 provides the ST3GalIII activities of the elution fractions from the amylose
column.

[0039] Figure 4 provides the results of an assay of glycoPEGylation of transferrin using
purified refolded MBP-ST3GalIII. Lanes are as follows: (1) MW markers [250, 148, 98, 64,
50 kD]; (2) Control asialotransferrin with no enzyme, indicated by solid arrow; (3)
transferrin-SA-PEG (20 kDa) production with Fraction #5, products indicated by arrowhead;
(4) transferrin-SA-PEG (20 kDa) production with Fraction #6, products indicated by
arrowhead; (5) Purified, refolded MBP-ST3GalIII Fr #6, indicated by dotted arrow; (6)
MW markers; (7) same as 2; (8) transferrin-SA-PEG (10 kDa) production with Fr #4,
products indicated by brackets; and (9) transferrin-SA-PEG (10 kDa) production with Fr #5,
products indicated by brackets.

[0040] Figure 5 provides the results of an assay of GlycoPEGylation of EPO using the
refolded SuperGlycoMix. Lanes are as follows: (1) MW markers, SeeBlue2
Invitrogen (250, 148, 98, 64, 50, 36, 22, 16, 6 kD); (2) Positive control with EPO, + NSO
expressed GalT1, BV GnT1, Aspergillus ST3GalIII and sugar nucleotides; (3) Negative
control, same as 2 without UDP-GlcNAc; (4) EPO, purified and separately refolded
MBP-GalT1(Δ129) C342T, refolded MBP-GnT1(Δ103), and Aspergillus niger expressed
ST3GalIII; (5) EPO, SuperGlycoMix (mixture of MBP-ST3GalIII, MBP-GalT1(Δ129)
C342T, MBP-GnT1(Δ103) C123A and sugar nucleotides).

[0041] Figure 6 provides an alignment of a human GnT1 amino acid sequence (top line,
NP_002397) and a rabbit GnT1 amino acid sequence (bottom line, P27115). The conserved
unpaired cysteines are underlined and in bold text.

[0042] Figure 7 provides the amino acid sequence of a GnT1 Cys121Ser mutant and a
nucleic acid sequence that encodes the mutant GnT1 protein. The amino acid sequence
depicted begins with amino acid residue 104 of the full length human protein and is
representative of mammalian GnT1 proteins with the following unpaired cysteine mutation:
...stvrrsldkllh..., where the bold residue is mutated from the wild-type cysteine.

[0043] Figure 8 provides the amino acid sequence of a GnT1 Cys121Asp mutant and a
nucleic acid sequence that encodes the mutant GnT1 protein. The amino acid sequence
depicted begins with amino acid residue 104 of the full length human protein and is
representative of mammalian GnT1 proteins with the following unpaired cysteine mutation:
...stvrrdldkllh..., where the bold residue is mutated from the wild-type cysteine.

[0044] Figure 9 provides the amino acid sequence of a GnT1 Cys121Thr mutant and a
nucleic acid sequence that encodes the mutant GnT1 protein. The amino acid sequence
depicted begins with amino acid residue 104 of the full length human protein and is
representative of mammalian GnT1 proteins with the following unpaired cysteine mutation:
...stvrrtldkllh..., where the bold residue is mutated from the wild-type cysteine.

[0045] Figure 10 provides the amino acid sequence of a GnT1 Cys121Ala mutant and a
nucleic acid sequence that encodes the mutant GnT1 protein. The amino acid sequence
depicted begins with amino acid residue 104 of the full length human protein and is
representative of mammalian GnT1 proteins with the following unpaired cysteine mutation:
...stvrraldkllh..., where the bold residue is mutated from the wild-type cysteine.

[0046] Figure 11 provides the amino acid sequence of a GnT1 Arg120Ala, Cys121His
mutant and a nucleic acid sequence that encodes the mutant GnT1 protein. The amino acid
sequence depicted begins with amino acid residue 104 of the full length human protein and is
representative of mammalian GnT1 proteins with the following double mutation:
...stvrahldkllh..., where the bold residue is mutated from the wild-type cysteine.
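
The motif notation used in the descriptions of Figures 7-11 can be reproduced with a short script that parses a mutation label and applies it to a sequence fragment. This is an illustrative sketch under stated assumptions: the wild-type motif "stvrrcldkllh" and its starting residue number (116) are inferred here from the mutant motifs and the stated positions 120 and 121, and are not recited explicitly in the text; the function names are hypothetical.

```python
import re

# Three-letter to one-letter codes for the residues named in the text.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASP": "D", "CYS": "C",
    "HIS": "H", "SER": "S", "THR": "T",
}

# Assumptions inferred from the figure descriptions, not stated explicitly.
WILD_TYPE_MOTIF = "stvrrcldkllh"
MOTIF_START = 116  # residue number of the first letter of the motif


def parse_mutation(label):
    """Split a label such as 'Cys121Ser' into (wild type, position, replacement)."""
    match = re.fullmatch(r"([A-Za-z]{3})(\d+)([A-Za-z]{3})", label.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, replacement = match.groups()
    return THREE_TO_ONE[wild_type.upper()], int(position), THREE_TO_ONE[replacement.upper()]


def apply_mutations(fragment, first_residue, labels):
    """Apply each mutation to a fragment whose first letter is residue first_residue."""
    residues = list(fragment.upper())
    for label in labels:
        wild_type, position, replacement = parse_mutation(label)
        index = position - first_residue
        if not 0 <= index < len(residues):
            raise ValueError(f"{label} lies outside the fragment")
        if residues[index] != wild_type:
            raise ValueError(f"{label}: fragment has {residues[index]} at {position}")
        residues[index] = replacement
    return "".join(residues).lower()


if __name__ == "__main__":
    print(apply_mutations(WILD_TYPE_MOTIF, MOTIF_START, ["Cys121Ser"]))
    print(apply_mutations(WILD_TYPE_MOTIF, MOTIF_START, ["Arg120Ala", "Cys121His"]))
```

The wild-type check inside `apply_mutations` guards against applying a label to a fragment with the wrong numbering, which is the most likely error when the fragment does not start at residue 1.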

[0047] Figure 12 provides the amino acid sequence of rat liver ST3GalIII. The underlined
and italicized sequence was deleted during cloning.

[0048] Figures 13A and 13B provide full length nucleic acid and amino acid sequences of
UDP-N-acetylgalactosaminyltransferase 2 (GalNAcT2). The accession number of the
nucleic acid and protein is NM_004481.

[0049] Figures 14A and 14B provide nucleic acid and amino acid sequences of a
Δ51 GalNAcT2. The numbering is based on the full length amino acid and nucleic acid
sequences shown in Figures 13A and B.

[0050] Figure 15 provides a demonstration of the protein concentration of refolded
MBP-GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0.
The pH values tested are expressed as solubilization pH-refolding pH. Protein concentrations
were measured immediately after refolding (light gray bars), after dialysis (dark gray bars),
and after concentration (white bars).

[0051] Figure 16 provides a demonstration of the enzymatic activity of refolded
MBP-GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0.
The pH values tested are expressed as solubilization pH-refolding pH. Activity was
measured after dialysis (light gray bars) and after concentration (dark gray bars).

[0052] Figure 17 provides a demonstration of the specific activity of refolded MBP- 
GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0. 
The pH values tested are expressed as solubilization pH-refolding pH. Specific activity was 
measured after dialysis (white bars) and after concentration (dark gray bars). 

[0053] Figures 18A and 18B provide results of remodeling of recombinant granulocyte
colony stimulating factor (GCSF) using refolded MBP-GalNAcT2(D51) after solubilization
at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0. Figure 18A shows the results using a
control purified MBP-GalNAcT2(D51), or a negative control that lacked a substrate, or
bacterially expressed MBP-GalNAcT2(D51) that was solubilized at pH 6.5 and refolded at
pH 6.5. Figure 18B shows the experimental results.

[0054] Figure 19 provides a profile of refolded MBP-GalNAcT2(D51) proteins after
elution from a Q Sepharose XL (QXL) column (Amersham Biosciences, Piscataway, NJ).
The top of the figure shows a chromatogram illustrating the elution of MBP-GalNAcT2(D51)
from the QXL column. Fraction numbers are indicated on the X-axis and the relative
absorbance of each fraction is indicated on the Y-axis. The bottom shows an image of two
electrophoretic gels used to visualize the eluted fractions. The contents of each lane on the gel
are described in the figure.

[0055] Figure 20 provides the GalNAcT2 activity of specific column fractions from the
QXL column shown in Figure 19. The most active fractions were applied to a
Hydroxyapatite Type I (80 µm) (BioRad, Hercules, CA) column.

[0056] Figure 21 provides a profile of refolded MBP-GalNAcT2(D51) proteins after
elution from the HA type I column. The top of the figure shows a chromatogram illustrating
the elution of MBP-GalNAcT2(D51) from the HA type I column. Fraction numbers are
indicated on the X-axis and the relative absorbance of each fraction is indicated on the
Y-axis. The bottom shows an image of an electrophoretic gel used to visualize the eluted
fractions. The contents of each lane on the gel are described in the figure.

[0057] Figure 22 provides the GalNAcT2 activity of HA type I eluted fractions. 

DEFINITIONS 

[0058] The recombinant glycosyltransferase proteins of the invention are useful for
transferring a saccharide from a donor substrate to an acceptor substrate. The addition
generally takes place at the non-reducing end of an oligosaccharide or carbohydrate moiety
on a biomolecule. Biomolecules as defined here include but are not limited to biologically
significant molecules such as carbohydrates, proteins (e.g., glycoproteins), and lipids (e.g.,
glycolipids, phospholipids, sphingolipids and gangliosides).

The following abbreviations are used herein:

Ara = arabinosyl;
Fru = fructosyl;
Fuc = fucosyl;
Gal = galactosyl;
GalNAc = N-acetylgalactosylamino;
Glc = glucosyl;
GlcNAc = N-acetylglucosylamino;
Man = mannosyl; and
NeuAc = sialyl (N-acetylneuraminyl)
FT or FucT = fucosyltransferase*
ST = sialyltransferase*
GalT = galactosyltransferase*

[0059] Arabic or Roman numerals are used interchangeably herein according to the naming
convention used in the art to indicate the identity of a specific glycosyltransferase (e.g.,
FTVII and FT7 refer to the same fucosyltransferase).

[0060] Oligosaccharides are considered to have a reducing end and a non-reducing end,
whether or not the saccharide at the reducing end is in fact a reducing sugar. In accordance
with accepted nomenclature, oligosaccharides are depicted herein with the non-reducing end
on the left and the reducing end on the right.

[0061] All oligosaccharides described herein are described with the name or abbreviation
for the non-reducing saccharide (e.g., Gal), followed by the configuration of the glycosidic
bond (α or β), the ring bond, the ring position of the reducing saccharide involved in the
bond, and then the name or abbreviation of the reducing saccharide (e.g., GlcNAc). The
linkage between two sugars may be expressed, for example, as 2,3, 2→3, or (2,3). Each
saccharide is a pyranose or furanose.
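
The naming convention described above can be illustrated with a small formatting helper. This is a minimal sketch, not part of the disclosure: the function name, parameter names, and the hyphen placed before the reducing saccharide in the arrow and parenthesized styles are assumptions made for illustration.

```python
def linkage_name(nonreducing, anomer, nonreducing_position, reducing_position,
                 reducing, style="comma"):
    """Build a disaccharide name, non-reducing saccharide on the left.

    style selects how the linkage is written: 'comma' gives 1,4;
    'arrow' gives 1→4; 'paren' gives (1,4).
    """
    if anomer not in ("α", "β"):
        raise ValueError("anomer must be 'α' or 'β'")
    if style == "comma":
        bond = f"{nonreducing_position},{reducing_position}"
    elif style == "arrow":
        bond = f"{nonreducing_position}→{reducing_position}"
    elif style == "paren":
        bond = f"({nonreducing_position},{reducing_position})"
    else:
        raise ValueError(f"unknown style: {style!r}")
    return f"{nonreducing}{anomer}{bond}-{reducing}"


if __name__ == "__main__":
    for style in ("comma", "arrow", "paren"):
        print(linkage_name("Gal", "β", 1, 4, "GlcNAc", style=style))
```

All three styles describe the same bond; only the way the two ring positions are written differs.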

[0062] The term "sialic acid" refers to any member of a family of nine-carbon carboxylated
sugars. The most common member of the sialic acid family is N-acetyl-neuraminic acid
(2-keto-5-acetamido-3,5-dideoxy-D-glycero-D-galactononulopyranos-1-onic acid (often
abbreviated as Neu5Ac, NeuAc, or NANA)). A second member of the family is
N-glycolyl-neuraminic acid (Neu5Gc or NeuGc), in which the N-acetyl group of NeuAc is hydroxylated.
A third sialic acid family member is 2-keto-3-deoxy-nonulosonic acid (KDN) (Nadano et al.
(1986) J. Biol. Chem. 261: 11550-11557; Kanamori et al., J. Biol. Chem. 265: 21811-21819
(1990)). Also included are 9-substituted sialic acids such as a 9-O-C1-C6 acyl-Neu5Ac like
9-O-lactyl-Neu5Ac or 9-O-acetyl-Neu5Ac, 9-deoxy-9-fluoro-Neu5Ac and
9-azido-9-deoxy-Neu5Ac. For review of the sialic acid family, see, e.g., Varki, Glycobiology 2: 25-40 (1992);
Sialic Acids: Chemistry, Metabolism and Function, R. Schauer, Ed. (Springer-Verlag, New
York (1992)). The synthesis and use of sialic acid compounds in a sialylation procedure is
disclosed in international application WO 92/16640, published October 1, 1992.

[0063] An "acceptor substrate" for a glycosyltransferase is an oligosaccharide moiety that 
can act as an acceptor for a particular glycosyltransferase. When the acceptor substrate is 
contacted with the corresponding glycosyltransferase and sugar donor substrate, and other 
necessary reaction mixture components, and the reaction mixture is incubated for a sufficient
period of time, the glycosyltransferase transfers sugar residues from the sugar donor substrate
to the acceptor substrate. The acceptor substrate will often vary for different types of a
particular glycosyltransferase. For example, the acceptor substrate for a mammalian
galactoside 2-L-fucosyltransferase (α1,2-fucosyltransferase) will include a Galβ1,4-GlcNAc-R
at a non-reducing terminus of an oligosaccharide; this fucosyltransferase attaches a fucose
residue to the Gal via an α1,2 linkage. Terminal Galβ1,4-GlcNAc-R and Galβ1,3-GlcNAc-R
and sialylated analogs thereof are acceptor substrates for α1,3 and α1,4-fucosyltransferases,
respectively. These enzymes, however, attach the fucose residue to the GlcNAc residue of
the acceptor substrate. Accordingly, the term "acceptor substrate" is taken in context with the
particular glycosyltransferase of interest for a particular application. Acceptor substrates for
additional glycosyltransferases are described herein. Acceptor substrates also include, e.g.,
peptides, proteins, glycopeptides, and glycoproteins.

[0064] A "donor substrate" for glycosyltransferases is an activated nucleotide sugar. Such 
activated sugars generally consist of uridine, guanosine, and cytidine monophosphate 
derivatives of the sugars (UMP, GMP and CMP, respectively) or diphosphate derivatives of 
the sugars (UDP, GDP and CDP, respectively) in which the nucleoside monophosphate or 
diphosphate serves as a leaving group. For example, a donor substrate for fucosyltransferases
is GDP-fucose. Donor substrates for sialyltransferases, for example, are activated sugar 
nucleotides comprising the desired sialic acid. For instance, in the case of NeuAc, the 
activated sugar is CMP-NeuAc. 

[0065] A "eukaryotic N-acetylglucosaminyltransferase I (GnTI or GNTI)" as used herein,
refers to a β-1,2-N-acetylglucosaminyltransferase I isolated from a eukaryotic organism. The
enzyme catalyzes the transfer of N-acetylglucosamine (GlcNAc) from a UDP-GlcNAc donor
to an acceptor molecule comprising a mannose sugar. Like other eukaryotic
glycosyltransferases, GnTI has a transmembrane domain, a stem region, and a catalytic
domain.

[0066] A "eukaryotic N-acetylgalactosaminyltransferase (GalNAcT)" as used herein, refers
to an N-acetylgalactosaminyltransferase isolated from a eukaryotic organism. The enzyme
catalyzes the transfer of N-acetylgalactosamine (GalNAc) from a UDP-GalNAc donor to an
acceptor molecule. Like other eukaryotic glycosyltransferases, GalNAcT enzymes have a
transmembrane domain, a stem region, and a catalytic domain. A number of GalNAcT
enzymes have been isolated and characterized, e.g., GalNAcT1, accession number X85018;
GalNAcT2, accession number X85019 (both described in White et al., J. Biol. Chem.
270:24156-24165 (1995)); and GalNAcT3, accession number X92689 (described in Bennett
et al., J. Biol. Chem. 271:17006-17012 (1996)).

[0067] An "unpaired cysteine residue" as used herein, refers to a cysteine residue, which in
a correctly folded protein (i.e., a protein with biological activity), does not form a disulfide
bond with another cysteine residue.

[0068] An "insoluble glycosyltransferase" refers to a glycosyltransferase that is expressed
in bacterial inclusion bodies. Insoluble glycosyltransferases are typically solubilized or
denatured using, e.g., detergents or chaotropic agents or some combination. "Refolding"
refers to a process of restoring the structure of a biologically active glycosyltransferase to a
glycosyltransferase that has been solubilized or denatured. Thus, a refolding buffer refers to
a buffer that enhances or accelerates refolding of a glycosyltransferase.

[0069] A "redox couple" refers to a mixture of reduced and oxidized thiol reagents and
includes reduced and oxidized glutathione (GSH/GSSG), cysteine/cystine,
cysteamine/cystamine, DTT/GSSG, and DTE/GSSG. (See, e.g., Clark, Cur. Op. Biotech.
12:202-207 (2001)).

[0070] The term "contacting" is used herein interchangeably with the following: combined 
with, added to, mixed with, passed over, incubated with, flowed over, etc. 

[0071] The term "PEG" refers to poly(ethylene glycol). PEG is an exemplary polymer that
has been conjugated to peptides. The use of PEG to derivatize peptide therapeutics has been
demonstrated to reduce the immunogenicity of the peptides and prolong the clearance time
from the circulation. For example, U.S. Pat. No. 4,179,337 (Davis et al.) concerns
non-immunogenic peptides, such as enzymes and peptide hormones coupled to polyethylene
glycol (PEG) or polypropylene glycol. Between 10 and 100 moles of polymer are used per
mole peptide and at least 15% of the physiological activity is maintained.

[0072] The term "specific activity" as used herein refers to the catalytic activity of an
enzyme, e.g., a recombinant glycosyltransferase fusion protein of the present invention, and
may be expressed in activity units. As used herein, one activity unit catalyzes the formation
of 1 µmol of product per minute at a given temperature (e.g., at 37°C) and pH value (e.g., at
pH 7.5). Thus, 10 units of an enzyme is a catalytic amount of that enzyme where 10 µmol of
substrate are converted to 10 µmol of product in one minute at a temperature of, e.g., 37°C
and a pH value of, e.g., 7.5.
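
The unit definition above reduces to simple arithmetic, sketched below. The function names are hypothetical. Normalizing units to protein mass (units per mg) is a common laboratory convention that is assumed here for illustration; the text itself states only that specific activity may be expressed in activity units.

```python
def activity_units(product_umol, minutes):
    """Activity units: µmol of product formed per minute under the assay conditions."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return product_umol / minutes


def units_per_mg(units, protein_mg):
    """Units normalized to protein mass (an assumed convention, not from the text)."""
    if protein_mg <= 0:
        raise ValueError("protein mass must be positive")
    return units / protein_mg


if __name__ == "__main__":
    # 10 µmol of product formed in one minute corresponds to 10 units.
    units = activity_units(product_umol=10.0, minutes=1.0)
    print(units)
    print(units_per_mg(units, protein_mg=2.0))
```

Temperature and pH are not inputs to the calculation; they are assay conditions (e.g., 37°C and pH 7.5) that should be reported alongside any unit value.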
[0073] "N-linked" oligosaccharides are those oligosaccharides that are linked to a peptide
backbone through asparagine, by way of an asparagine-N-acetylglucosamine linkage.
N-linked oligosaccharides are also called "N-glycans." All N-linked oligosaccharides have a
common pentasaccharide core of Man3GlcNAc2. They differ in the presence of, and in the
number of branches (also called antennae) of peripheral sugars such as N-acetylglucosamine,
galactose, N-acetylgalactosamine, fucose and sialic acid. Optionally, this structure may also
contain a core fucose molecule and/or a xylose molecule.

[0074] "O-linked" oligosaccharides are those oligosaccharides that are linked to a peptide
backbone through threonine, serine, hydroxyproline, tyrosine, or other hydroxy-containing
amino acids.

[0075] A "substantially uniform glycoform" or a "substantially uniform glycosylation
pattern," when referring to a glycoprotein species, refers to the percentage of acceptor
substrates that are glycosylated by the glycosyltransferase of interest (e.g.,
fucosyltransferase). It will be understood by one of skill in the art that the starting material
may contain glycosylated acceptor substrates. Thus, the calculated amount of glycosylation
will include acceptor substrates that are glycosylated by the methods of the invention, as well
as those acceptor substrates already glycosylated in the starting material.

[0076] The term "biological activity" refers to an enzymatic activity of a protein. For
example, biological activity of a sialyltransferase refers to the activity of transferring a sialic
acid moiety from a donor molecule to an acceptor molecule. Biological activity of a
GalNAcT2 refers to the activity of transferring an N-acetylgalactosamine moiety from a
donor molecule to an acceptor molecule. For GalNAcT2 proteins, an acceptor molecule can
be a protein, a peptide, a glycoprotein, or a glycopeptide. Biological activity of a GnT1
protein refers to the activity of transferring an N-acetylglucosamine moiety from a donor
molecule to an acceptor molecule. Biological activity of a galactosyltransferase refers to the
activity of transferring a galactose moiety from a donor molecule to an acceptor molecule.

[0077] "Commercial scale" refers to gram scale production of a product saccharide in a 
single reaction. In preferred embodiments, commercial scale refers to production of greater
than about 50, 75, 80, 90 or 100, 125, 150, 175, or 200 grams. 

[0078] The term "substantially" in the above definitions of "substantially uniform"
generally means at least about 60%, at least about 70%, at least about 80%, or more
preferably at least about 90%, and still more preferably at least about 95% of the acceptor
substrates for a particular glycosyltransferase are glycosylated. 
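
By way of illustration only, the percentage described above can be computed as in the following Python sketch. The function names and the use of a plain greater-than-or-equal comparison for "at least about" are assumptions made for the example; the specification does not prescribe any particular calculation.

```python
# Threshold tiers (percent of acceptor substrates) named in the definition
# of "substantially". Treating "at least about" as a strict >= comparison
# is an assumption of this sketch.
TIERS = (60.0, 70.0, 80.0, 90.0, 95.0)


def percent_glycosylated(glycosylated: float, total: float) -> float:
    """Percent of acceptor substrates that carry the glycan of interest.

    `glycosylated` counts substrates glycosylated by the method together
    with those already glycosylated in the starting material.
    """
    if total <= 0:
        raise ValueError("total acceptor substrate must be positive")
    if not 0 <= glycosylated <= total:
        raise ValueError("glycosylated amount must lie between 0 and total")
    return 100.0 * glycosylated / total


def highest_tier_met(percent: float):
    """Return the highest tier reached, or None if the value is below 60%."""
    met = [tier for tier in TIERS if percent >= tier]
    return max(met) if met else None
```

For example, 46 glycosylated acceptor substrates out of 50 gives 92%, which meets the 90% tier but not the 95% tier.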

[0079] The term "amino acid" refers to naturally occurring and synthetic amino acids, as 
well as amino acid analogs and amino acid mimetics that function in a manner similar to the 
naturally occurring amino acids. Naturally occurring amino acids are those encoded by the
genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs are compounds that have
the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, and methionine methyl sulfonium. Such analogs have modified
R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical
structure as a naturally occurring amino acid. Amino acid mimetics are chemical
compounds that have a structure that is different from the general chemical structure of an
amino acid, but that function in a manner similar to a naturally occurring amino acid.

[0080] "Protein", "polypeptide", or "peptide" refer to a polymer in which the monomers are
amino acids and are joined together through amide bonds, alternatively referred to as a
polypeptide. When the amino acids are α-amino acids, either the L-optical isomer or the
D-optical isomer can be used. Additionally, unnatural amino acids, for example, β-alanine,
phenylglycine and homoarginine are also included. Amino acids that are not gene-encoded
may also be used in the present invention. Furthermore, amino acids that have been modified
to include reactive groups may also be used in the invention. All of the amino acids used in
the present invention may be either the D- or L-isomer. The L-isomers are generally
preferred. In addition, other peptidomimetics are also useful in the present invention. For a
general review, see, Spatola, A. F., in Chemistry and Biochemistry of Amino Acids,
Peptides and Proteins, B. Weinstein, ed., Marcel Dekker, New York, p. 267 (1983).

[0081] The term "recombinant" when used with reference to a cell indicates that the cell 
replicates a heterologous nucleic acid, or expresses a peptide or protein encoded by a 
heterologous nucleic acid. Recombinant cells can contain genes that are not found within the 
native (non-recombinant) form of the cell. Recombinant cells can also contain genes found 
in the native form of the cell wherein the genes are modified and re-introduced into the cell
by artificial means. The term also encompasses cells that contain a nucleic acid endogenous
to the cell that has been modified without removing the nucleic acid from the cell; such
modifications include those obtained by gene replacement, site-specific mutation, and related
techniques. A "recombinant protein" is one which has been produced by a recombinant cell. 

[0082] A "fusion protein" refers to a protein comprising amino acid sequences that are in 
addition to, in place of, less than, and/or different from the amino acid sequences encoding 
the original or native full-length protein or subsequences thereof.

[0083] Components of fusion proteins include "accessory enzymes" and/or "purification 
tags." An "accessory enzyme" as referred to herein, is an enzyme that is involved in 
catalyzing a reaction that, for example, forms a substrate for a glycosyltransferase. An 
accessory enzyme can, for example, catalyze the formation of a nucleotide sugar that is used
as a donor moiety by a glycosyltransferase. An accessory enzyme can also be one that is used
in the generation of a nucleotide triphosphate required for formation of a nucleotide sugar, or
in the generation of the sugar which is incorporated into the nucleotide sugar. The
recombinant fusion protein of the invention can be constructed and expressed as a fusion
protein with a molecular "purification tag" at one end, which facilitates purification of the
protein. Such tags can also be used for immobilization of a protein of interest during the
glycosylation reaction. Suitable tags include "epitope tags," which are protein sequences
that are specifically recognized by an antibody. Epitope tags are generally incorporated into
fusion proteins to enable the use of a readily available antibody to unambiguously detect or
isolate the fusion protein. A "FLAG tag" is a commonly used epitope tag, specifically
recognized by a monoclonal anti-FLAG antibody, consisting of the sequence
AspTyrLysAspAspAspAspLys or a substantially identical variant thereof. Other suitable
tags are known to those of skill in the art, and include, for example, an affinity tag such as a
hexahistidine peptide, which will bind to metal ions such as nickel or cobalt ions. Proteins
comprising purification tags can be purified using a binding partner that binds the purification
tag, e.g., antibodies to the purification tag, nickel or cobalt ions or resins, and amylose,
maltose, or a cyclodextrin. Purification tags also include maltose binding domains and starch
binding domains. Purification of maltose binding domain proteins is known to those of skill
in the art. Starch binding domains are described in WO 99/15636, herein incorporated by
reference. Affinity purification of a fusion protein comprising a starch binding domain using
a beta-cyclodextrin (BCD)-derivatized resin is described in USSN 60/468,374, filed May 5,
2003, herein incorporated by reference in its entirety.

[0084] The term "functional domain" with reference to glycosyltransferases, refers to a
domain of the glycosyltransferase that confers or modulates an activity of the enzyme, e.g.,
acceptor substrate specificity, catalytic activity, binding affinity, localization within the Golgi
apparatus, anchoring to a cell membrane, or other biological or biochemical activity.
Examples of functional domains of glycosyltransferases include, but are not limited to, the
catalytic domain, stem region, and signal-anchor domain. 

[0085] The terms "expression level" or "level of expression" with reference to a protein 
refers to the amount of a protein produced by a cell. The amount of protein produced by a 
cell can be measured by the assays and activity units described herein or known to one skilled
in the art. One skilled in the art would know how to measure and describe the amount of
protein produced by a cell using a variety of assays and units, respectively. Thus, the
quantitation and quantitative description of the level of expression of a protein, e.g., a
glycosyltransferase, is not limited to the assays used to measure the activity or the units used
to describe the activity, respectively. The amount of protein produced by a cell can be
determined by standard known assays, for example, the protein assay by Bradford (1976), the
bicinchoninic acid protein assay kit from Pierce (Rockford, Illinois), or as described in U.S. 
Patent No. 5,641,668. 

[0086] The term "enzymatic activity" refers to an activity of an enzyme and may be 
measured by the assays and units described herein or known to one skilled in the art. 
Examples of an activity of a glycosyltransferase include, but are not limited to, those
associated with the functional domains of the enzyme, e.g., acceptor substrate specificity,
catalytic activity, binding affinity, localization within the Golgi apparatus, anchoring to a cell 
membrane, or other biological or biochemical activity. 

[0087] A "stem region" with reference to glycosyltransferases refers to a protein domain, or 
a subsequence thereof, which in the native glycosyltransferases is located adjacent to the
trans-membrane domain, and has been reported to function as a retention signal to maintain
the glycosyltransferase in the Golgi apparatus and as a site of proteolytic cleavage.
Exemplary stem regions include, but are not limited to, the stem region of fucosyltransferase
VI, amino acid residues 40-54; the stem region of mammalian GnT1, amino acid residues
from about 36 to about 103 (see, e.g., the human enzyme); the stem region of mammalian
GalT1, amino acid residues from about 71 to about 129 (see, e.g., the bovine enzyme); and the
stem region of mammalian ST3GalIII, amino acid residues from about 29 to about 84 (see,
e.g., the rat enzyme). 

[0088] A "catalytic domain" refers to a protein domain, or a subsequence thereof, that 
catalyzes an enzymatic reaction performed by the enzyme. For example, a catalytic domain 
of a sialyltransferase will include a subsequence of the sialyltransferase sufficient to transfer 
a sialic acid residue from a donor to an acceptor saccharide. A catalytic domain can include 
an entire enzyme, a subsequence thereof, or can include additional amino acid sequences that 
are not attached to the enzyme, or a subsequence thereof, as found in nature. Exemplary
catalytic domains include, but are not limited to, the catalytic domain of fucosyltransferase VII, amino
acid residues 39-342; the catalytic domain of mammalian GnT1, amino acid residues from
about 104 to about 445 (see, e.g., the human enzyme); the catalytic domain of mammalian
GalT1, amino acid residues from about 130 to about 402 (see, e.g., the bovine enzyme); and
the catalytic domain of mammalian ST3GalIII, amino acid residues from about 85 to about
374 (see, e.g., the rat enzyme). Catalytic domains and truncation mutants of GalNAcT2
proteins are described in USSN 60/576,530 filed June 3, 2004; and US provisional patent
application Attorney Docket Number 040853-01-5149-P1, filed August 3, 2004; both of
which are herein incorporated by reference for all purposes. 

[0089] A "subsequence" refers to a sequence of nucleic acids or amino acids that comprises
a part of a longer sequence of nucleic acids or amino acids (e.g., protein) respectively. 

[0090] A "glycosyltransferase truncation" or a "truncated glycosyltransferase" or 
grammatical variants, refer to a glycosyltransferase that has fewer amino acid residues than a 
naturally occurring glycosyltransferase, but that retains enzymatic activity. Truncated 
glycosyltransferases include, e.g., truncated GnT1 enzymes, truncated GalT1 enzymes,
truncated ST3GalIII enzymes, and truncated GalNAcT2 enzymes. Any number of amino
acid residues can be deleted so long as the enzyme retains activity. In some embodiments,
domains or portions of domains can be deleted, e.g., a signal-anchor domain can be deleted
leaving a truncation comprising a stem region and a catalytic domain; or a signal-anchor
domain and a stem region can be deleted leaving a truncation comprising a catalytic domain.
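
As a purely illustrative sketch, the two truncations described above can be expressed as slices of an amino acid sequence using 1-based, inclusive residue numbering. The `Domain` class, the `truncate` helper, and the placeholder sequence used in the usage note are hypothetical; the boundaries are the approximate ones quoted above for the stem region and catalytic domain of the human GnT1 enzyme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


# Approximate boundaries quoted in the definitions above (human GnT1).
GNT1_DOMAINS = (
    Domain("stem region", 36, 103),
    Domain("catalytic domain", 104, 445),
)


def truncate(sequence: str, domains) -> str:
    """Return the contiguous subsequence spanning the retained domains."""
    start = min(d.start for d in domains)
    end = max(d.end for d in domains)
    if start < 1 or end > len(sequence):
        raise ValueError("domain boundaries fall outside the sequence")
    # Convert 1-based inclusive coordinates to a Python slice.
    return sequence[start - 1:end]
```

Passing both domains deletes only the signal-anchor domain; passing only the catalytic domain deletes the signal-anchor domain and the stem region. Whether a given truncation retains enzymatic activity must still be determined experimentally.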

[0091] The term "nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer 
in either single- or double-stranded form, and unless otherwise limited, encompasses known
analogues of natural nucleotides that hybridize to nucleic acids in a manner similar to
naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid
sequence includes the complementary sequence thereof. 

[0092] A "recombinant expression cassette" or simply an "expression cassette" is a nucleic 
acid construct, generated recombinantly or synthetically, with nucleic acid elements that are 
capable of affecting expression of a structural gene in hosts compatible with such sequences.
Expression cassettes include at least promoters and optionally, transcription termination 
signals. Typically, the recombinant expression cassette includes a nucleic acid to be 
transcribed (e.g., a nucleic acid encoding a desired polypeptide), and a promoter. Additional 
factors necessary or helpful in effecting expression may also be used as described herein. For 
example, an expression cassette can also include nucleotide sequences that encode a signal
sequence that directs secretion of an expressed protein from the host cell. Transcription 
termination signals, enhancers, and other nucleic acid sequences that influence gene 
expression, can also be included in an expression cassette. 

[0093] A "heterologous sequence" or a "heterologous nucleic acid", as used herein, is one 
that originates from a source foreign to the particular host cell, or, if from the same source, is
modified from its original form. Thus, a heterologous glycoprotein gene in a eukaryotic host 
cell includes a glycoprotein-encoding gene that is endogenous to the particular host cell that 
has been modified. Modification of the heterologous sequence may occur, e.g., by treating 
the DNA with a restriction enzyme to generate a DNA fragment that is capable of being 
operably linked to the promoter. Techniques such as site-directed mutagenesis are also useful
for modifying a heterologous sequence. 

[0094] The term "isolated" refers to material that is substantially or essentially free from 
components which interfere with the activity of an enzyme. For a saccharide, protein, or 
nucleic acid of the invention, the term "isolated" refers to material that is substantially or
essentially free from components which normally accompany the material as found in its
native state. Typically, an isolated saccharide, protein, or nucleic acid of the invention is at 
least about 80% pure, usually at least about 90%, and preferably at least about 95% pure as 
measured by band intensity on a silver stained gel or other method for determining purity. 
Purity or homogeneity can be indicated by a number of means well known in the art. For
example, a protein or nucleic acid in a sample can be resolved by polyacrylamide gel
electrophoresis, and then the protein or nucleic acid can be visualized by staining. For certain
purposes high resolution of the protein or nucleic acid may be desirable and HPLC or a
similar means for purification, for example, may be utilized. 

[0095] The term "operably linked" refers to functional linkage between a nucleic acid 
expression control sequence (such as a promoter, signal sequence, or array of transcription 
factor binding sites) and a second nucleic acid sequence, wherein the expression control
sequence affects transcription and/or translation of the nucleic acid corresponding to the 
second sequence. 

[0096] The terms "identical" or percent "identity," in the context of two or more nucleic 
acids or protein sequences, refer to two or more sequences or subsequences that are the same 
or have a specified percentage of amino acid residues or nucleotides that are the same, when
compared and aligned for maximum correspondence, as measured using one of the following 
sequence comparison algorithms or by visual inspection. 

[0097] The phrase "substantially identical," in the context of two nucleic acids or proteins, 
refers to two or more sequences or subsequences that have greater than about 60%
nucleic acid or amino acid sequence identity, e.g., 65%, 70%, 75%, 80%, 85%, or 90%, preferably
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% nucleotide or amino acid residue
identity, when compared and aligned for maximum correspondence, as measured using one
of the following sequence comparison algorithms or by visual inspection. Preferably, the
substantial identity exists over a region of the sequences that is at least about 50 residues in
length, more preferably over a region of at least about 100 residues, and most preferably the
sequences are substantially identical over at least about 150 residues. In a most preferred 
embodiment, the sequences are substantially identical over the entire length of the coding 
regions. 

[0098] For sequence comparison, typically one sequence acts as a reference sequence, to 
which test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are input into a computer, subsequence coordinates are designated, if 
necessary, and sequence algorithm program parameters are designated. The sequence 
comparison algorithm then calculates the percent sequence identity for the test sequence(s) 
relative to the reference sequence, based on the designated program parameters. 
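
For illustration, once a test sequence has been aligned to a reference sequence, percent identity can be calculated as in the following Python sketch. The function name, the gap character, and the choice of denominator (all alignment columns except those in which both sequences have a gap) are assumptions of the example; alignment programs differ in how they define the denominator.

```python
def percent_identity(aligned_test: str, aligned_ref: str, gap: str = "-") -> float:
    """Percent of compared alignment columns in which the residues are the same.

    Both inputs must already be aligned (equal length, gaps marked by `gap`).
    Columns where both sequences carry a gap are ignored; a column with a gap
    in only one sequence counts as a non-identical column.
    """
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_test, aligned_ref):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared
```

For example, the aligned pair `ACGT-A` and `ACCTGA` has four identical columns out of six, or about 66.7% identity.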

[0099] Optimal alignment of sequences for comparison can be conducted, e.g., by the local
homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology
alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for
similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by
computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA
in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr.,
Madison, WI), or by visual inspection (see generally, Current Protocols in Molecular
Biology, F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene
Publishing Associates, Inc. and John Wiley & Sons, Inc., (1995 Supplement) (Ausubel)). 
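
As a non-limiting illustration of a global alignment in the style of Needleman & Wunsch, the following Python sketch fills a dynamic-programming matrix and traces back one optimal alignment. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name are assumptions chosen for the example and are not parameters taken from the cited programs.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The aligned strings returned by this sketch have equal length and can be inspected column by column to count identical residues.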

[0100] Examples of algorithms that are suitable for determining percent sequence identity 
and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in 
Altschul et al. (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids
Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly
available through the National Center for Biotechnology Information
(www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence
pairs (HSPs) by identifying short words of length W in the query sequence, which either
match or satisfy some positive-valued threshold score T when aligned with a word of the
same length in a database sequence. T is referred to as the neighborhood word score
threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for
initiating searches to find longer HSPs containing them. The word hits are then extended in
both directions along each sequence for as far as the cumulative alignment score can be
increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters
M (reward score for a pair of matching residues; always > 0) and N (penalty score for
mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to
calculate the cumulative score. Extension of the word hits in each direction is halted when:
the cumulative alignment score falls off by the quantity X from its maximum achieved value;
the cumulative score goes to zero or below, due to the accumulation of one or more
negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm
parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN
program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation
(E) of 10, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the
BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the
BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915
(1992)).
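
The extension step described above can be sketched, for nucleotide sequences and without gaps, as follows. This is a simplified illustration and not the BLAST implementation: the function name is hypothetical, the value of X (`x_drop=20`) is an assumption because no value for X is given above, and only extension to the right of the word hit is shown (extension to the left is symmetric).

```python
def extend_hit_right(query: str, subject: str, q_end: int, s_end: int,
                     seed_score: int, reward: int = 5, penalty: int = -4,
                     x_drop: int = 20):
    """Extend a word hit to the right without gaps.

    `q_end` and `s_end` are the 0-based positions just past the word hit in
    the query and subject. Returns (best_score, residues_added), where
    best_score is the maximum cumulative score reached.
    """
    score = best = seed_score
    best_len = 0
    steps = 0
    while q_end + steps < len(query) and s_end + steps < len(subject):
        same = query[q_end + steps] == subject[s_end + steps]
        score += reward if same else penalty  # M for a match, N for a mismatch
        steps += 1
        if score > best:
            best, best_len = score, steps
        # Halt when the score falls off by X from its maximum, or reaches zero.
        if best - score >= x_drop or score <= 0:
            break
    return best, best_len
```

With M=5 and N=-4, a four-residue word hit scores 20; four further matching residues raise the cumulative score to 40, and subsequent mismatches lower the score without changing the reported maximum.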

[0101] In addition to calculating percent sequence identity, the BLAST algorithm also 
performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin &
Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity
provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an
indication of the probability by which a match between two nucleotide or amino acid
sequences would occur by chance. For example, a nucleic acid is considered similar to a
reference sequence if the smallest sum probability in a comparison of the test nucleic acid to
the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and 
most preferably less than about 0.001. 

[0102] A further indication that two nucleic acid sequences or proteins are substantially 
identical is that the protein encoded by the first nucleic acid is immunologically cross reactive 
with the protein encoded by the second nucleic acid, as described below. Thus, a protein is
typically substantially identical to a second protein, for example, where the two peptides 
differ only by conservative substitutions. Another indication that two nucleic acid sequences 
are substantially identical is that the two molecules hybridize to each other under stringent 
conditions, as described below. 

[0103] The phrase "hybridizing specifically to" refers to the binding, duplexing, or
hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions
when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.

[0104] The term "stringent conditions" refers to conditions under which a probe will 
hybridize to its target subsequence, but to no other sequences. Stringent conditions are
sequence-dependent and will be different in different circumstances. Longer sequences
hybridize specifically at higher temperatures. Generally, stringent conditions are selected to
be about 15°C lower than the thermal melting point (Tm) for the specific sequence at a
defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH,
and nucleic acid concentration) at which 50% of the probes complementary to the target
sequence hybridize to the target sequence at equilibrium. (As the target sequences are
generally present in excess, at Tm, 50% of the probes are occupied at equilibrium.)
Typically, stringent conditions will be those in which the salt concentration is less than about
1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to
8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides) and
at least about 60°C for long probes (e.g., greater than 50 nucleotides). Stringent conditions
may also be achieved with the addition of destabilizing agents such as formamide. For
selective or specific hybridization, a positive signal is typically at least two times
background, preferably 10 times background hybridization. Exemplary stringent
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and
0.1% SDS at 65°C. For PCR, a temperature of about 36°C is typical for low stringency
amplification, although annealing temperatures may vary between about 32°C and 48°C depending
on primer length. For high stringency PCR amplification, a temperature of about 62°C is
typical, although high stringency annealing temperatures can range from about 50°C to about
65°C, depending on the primer length and specificity. Typical cycle conditions for both high
and low stringency amplifications include a denaturation phase of 90-95°C for 30-120 sec,
an annealing phase lasting 30-120 sec, and an extension phase of about 72°C for 1-2 min.
Protocols and guidelines for low and high stringency amplification reactions are available, 
e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and Applications Academic 
Press, N.Y. 
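
As a rough, illustrative reading of the temperature guidance in the definition of "stringent conditions" above, the following Python sketch proposes a hybridization temperature from a known Tm and probe length. The function name is hypothetical, the Tm is assumed to have been determined separately, and combining the "about 15°C below Tm" rule with the stated minimum temperatures by taking the larger value is an assumption of the sketch rather than a statement in the text.

```python
def stringent_hybridization_temp(tm_celsius: float, probe_length_nt: int,
                                 offset: float = 15.0) -> float:
    """Suggest a hybridization temperature in degrees Celsius.

    Starts from Tm minus `offset` and applies the minimum temperatures
    stated above: about 30 C for probes of up to 50 nucleotides and
    about 60 C for longer probes.
    """
    if probe_length_nt <= 0:
        raise ValueError("probe length must be positive")
    candidate = tm_celsius - offset
    floor = 30.0 if probe_length_nt <= 50 else 60.0
    return max(candidate, floor)
```

For example, a 25-nucleotide probe with a Tm of 75°C yields 60°C. Salt concentration, pH, and destabilizing agents such as formamide are not modeled.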

[0105] The phrases "specifically binds to a protein" or "specifically immunoreactive with", 
when referring to an antibody refers to a binding reaction which is determinative of the
presence of the protein in the presence of a heterogeneous population of proteins and other
biologics. Thus, under designated immunoassay conditions, the specified antibodies bind
preferentially to a particular protein and do not bind in a significant amount to other proteins
present in the sample. Specific binding to a protein under such conditions requires an
antibody that is selected for its specificity for a particular protein. A variety of immunoassay
formats may be used to select antibodies specifically immunoreactive with a particular
protein. For example, solid-phase ELISA immunoassays are routinely used to select
monoclonal antibodies specifically immunoreactive with a protein. See Harlow and Lane
(1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York, for a
description of immunoassay formats and conditions that can be used to determine specific
immunoreactivity. 

[0106] "Conservatively modified variations" of a particular polynucleotide sequence refers 
to those polynucleotides that encode identical or essentially identical amino acid sequences, 
or where the polynucleotide does not encode an amino acid sequence, to essentially identical 
sequences. Because of the degeneracy of the genetic code, a large number of functionally
identical nucleic acids encode any given protein. For instance, the codons CGU, CGC, CGA,
CGG, AGA, and AGG all encode the amino acid arginine. Thus, at every position where an
arginine is specified by a codon, the codon can be altered to any of the corresponding codons
described without altering the encoded protein. Such nucleic acid variations are "silent
variations," which are one species of "conservatively modified variations." Every
polynucleotide sequence described herein which encodes a protein also describes every
possible silent variation, except where otherwise noted. One of skill will recognize that each
codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and 
UGG which is ordinarily the only codon for tryptophan) can be modified to yield a 
functionally identical molecule by standard techniques. Accordingly, each "silent variation" 
of a nucleic acid which encodes a protein is implicit in each described sequence. 
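
The notion of silent variations can be illustrated with a short Python sketch that enumerates every RNA coding sequence for a given peptide. The codon table below is deliberately partial and contains only the three amino acids whose codons are listed above (arginine, methionine, tryptophan); the function names and one-letter keys are choices made for the example.

```python
from itertools import product

# Partial codon table limited to the codons named in the text above.
CODONS = {
    "R": ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],  # arginine
    "M": ["AUG"],                                      # methionine
    "W": ["UGG"],                                      # tryptophan
}


def silent_variants(peptide: str, codon_table=CODONS):
    """Yield every RNA sequence that encodes `peptide` under `codon_table`."""
    for combo in product(*(codon_table[aa] for aa in peptide)):
        yield "".join(combo)


def count_silent_variants(peptide: str, codon_table=CODONS) -> int:
    """Number of distinct coding sequences: the product of codon counts."""
    total = 1
    for aa in peptide:
        total *= len(codon_table[aa])
    return total
```

Under this table the tripeptide Met-Arg-Trp has six coding sequences, all arising from the six arginine codons, because methionine and tryptophan each have a single codon.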

[0107] Furthermore, one of skill will recognize that individual substitutions, deletions or
additions which alter, add or delete a single amino acid or a small percentage of amino acids
(typically less than 5%, more typically less than 1%) in an encoded sequence are
"conservatively modified variations" where the alterations result in the substitution of an
amino acid with a chemically similar amino acid. Conservative substitution tables providing
functionally similar amino acids are well known in the art.

[0108] One of skill will appreciate that many conservative variations of proteins, e.g., 
glycosyltransferases, and nucleic acids which encode proteins yield essentially identical
products. For example, due to the degeneracy of the genetic code, "silent substitutions" (i.e.,
substitutions of a nucleic acid sequence which do not result in an alteration in an encoded
protein) are an implied feature of every nucleic acid sequence which encodes an amino acid.
As described herein, sequences are preferably optimized for expression in a particular host
cell used to produce the chimeric glycosyltransferases (e.g., yeast, human, and the like).
Similarly, "conservative amino acid substitutions," in which one or a few amino acids in an amino
acid sequence are substituted with different amino acids with highly similar properties (see
the definitions section, supra), are also readily identified as being highly similar to a
particular amino acid sequence, or to a particular nucleic acid sequence which encodes an
amino acid. Such conservatively substituted variations of any particular sequence are a
feature of the present invention. See also, Creighton (1984) Proteins, W.H. Freeman and
Company. In addition, individual substitutions, deletions or additions which alter, add or
delete a single amino acid or a small percentage of amino acids in an encoded sequence are
also "conservatively modified variations".

[0109] The practice of this invention can involve the construction of recombinant nucleic
acids and the expression of genes in host cells, preferably bacterial host cells. Molecular
cloning techniques to achieve these ends are known in the art. A wide variety of cloning and
in vitro amplification methods suitable for the construction of recombinant nucleic acids such
as expression vectors are well known to persons of skill. Examples of these techniques and
instructions sufficient to direct persons of skill through many cloning exercises are found in
Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 Academic Press, Inc., San Diego, CA (Berger); and Current Protocols in
Molecular Biology, F.M. Ausubel et al., eds., Current Protocols, a joint venture between
Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1999 Supplement)
(Ausubel). Suitable host cells for expression of the recombinant polypeptides are known to
those of skill in the art, and include, for example, prokaryotic cells, such as E. coli, and
eukaryotic cells including insect, mammalian and fungal cells (e.g., Aspergillus niger).

[0110] Examples of protocols sufficient to direct persons of skill through in vitro 
amplification methods, including the polymerase chain reaction (PCR), the ligase chain
reaction (LCR), Qβ-replicase amplification and other RNA polymerase mediated techniques
are found in Berger, Sambrook, and Ausubel, as well as Mullis et al. (1987) U.S. Patent No.
4,683,202; PCR Protocols A Guide to Methods and Applications (Innis et al. eds) Academic
Press Inc. San Diego, CA (1990) (Innis); Arnheim & Levinson (October 1, 1990) C&EN
36-47; The Journal Of NIH Research (1991) 3: 81-94; Kwoh et al. (1989) Proc. Natl. Acad. Sci.
USA 86: 1173; Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874; Lomeli et al.
(1989) J. Clin. Chem. 35: 1826; Landegren et al. (1988) Science 241: 1077-1080; Van Brunt
(1990) Biotechnology 8: 291-294; Wu and Wallace (1989) Gene 4: 560; and Barringer et al.
(1990) Gene 89: 117. Improved methods of cloning in vitro amplified nucleic acids are
described in Wallace et al., U.S. Pat. No. 5,426,039.

DETAILED DESCRIPTION OF THE INVENTION 
I. Introduction 

[0111] The present invention provides conditions for refolding eukaryotic 
glycosyltransferases that are expressed as insoluble proteins in bacterial inclusion bodies. 
Refolding buffers comprising redox couples are used to enhance refolding of insoluble
eukaryotic glycosyltransferases. For some insoluble eukaryotic glycosyltransferases,
refolding can be enhanced by site directed mutagenesis to remove unpaired cysteines. The
invention also provides methods to refold more than one glycosyltransferase in a single
vessel, thereby enhancing refolding of the proteins and increasing efficiency of protein
production. The refolded eukaryotic glycosyltransferases can be used to produce or to
remodel polysaccharides, oligosaccharides, glycolipids, and glycoproteins. The refolded
eukaryotic glycosyltransferases can also be used to glycoPEGylate glycoproteins as described
in PCT/US02/32263, which is herein incorporated by reference for all purposes.

II. Refolding insoluble glycosyltransferases 

[0112] Many recombinant proteins expressed in bacteria are expressed as insoluble
aggregates in bacterial inclusion bodies. Inclusion bodies are protein deposits found in both
the cytoplasmic and periplasmic space of bacteria. (See, e.g., Clark, Cur. Op. Biotech.
12:202-207 (2001)). Eukaryotic glycosyltransferases are frequently expressed in bacterial
inclusion bodies. Some eukaryotic glycosyltransferases are soluble in bacteria, i.e., not
produced in inclusion bodies, when only the catalytic domain of the protein is expressed.
However, many eukaryotic glycosyltransferases are expressed in bacterial inclusion bodies,
even if only the catalytic domain is expressed, and methods for refolding these proteins to
produce active glycosyltransferases are provided herein.

A. Conditions for refolding active glycosyltransferases 
[0113] To produce active eukaryotic glycosyltransferases from bacterial cells, eukaryotic
glycosyltransferases are expressed in bacterial inclusion bodies, the bacteria are harvested
and disrupted, and the inclusion bodies are isolated and washed. The proteins within the
inclusion bodies are then solubilized. Solubilization can be performed using denaturants,
e.g., guanidinium chloride or urea; extremes of pH; or detergents.

[0114] After solubilization, denaturants are removed from the glycosyltransferase mixture.
Denaturant removal can be done by a variety of methods, including dilution into a refolding
buffer or buffer exchange methods. Buffer exchange methods include dialysis, diafiltration,
gel filtration, and immobilization of the protein onto a solid support. (See, e.g., Clark, Cur.
Op. Biotech. 12:202-207 (2001)). Any of the above methods can be combined to remove
denaturants.

[0115] Disulfide bond formation in the eukaryotic glycosyltransferase is promoted by
addition of a refolding buffer comprising a redox couple. Redox couples include reduced and
oxidized glutathione (GSH/GSSG), cysteine/cystine, cysteamine/cystamine, DTT/GSSG, and
DTE/GSSG. (See, e.g., Clark, Cur. Op. Biotech. 12:202-207 (2001)).

[0116] Refolding can be performed in buffers at pH values ranging from, for example, 6.0 to
10.0. Refolding buffers can include other additives to enhance refolding, e.g., L-arginine
(0.4-1 M); PEG; low concentrations of denaturants, such as urea (1-2 M) and guanidinium
chloride (0.5-1.5 M); and detergents (e.g., Chaps, SDS, CTAB, lauryl maltoside, and Triton
X-100).
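
By way of illustration only, the exemplary ranges recited above can be captured in a short Python sketch that flags components of a candidate refolding buffer falling outside those ranges. The dictionary keys and the function name are hypothetical labels chosen for this example; only the numeric ranges come from the text, and a recipe outside them is not necessarily unsuitable.

```python
# Exemplary ranges quoted in the text above; the key names are hypothetical labels.
STATED_RANGES = {
    "pH": (6.0, 10.0),
    "L-arginine_M": (0.4, 1.0),
    "urea_M": (1.0, 2.0),
    "guanidinium_chloride_M": (0.5, 1.5),
}


def out_of_range(recipe):
    """Return the names of recipe components that fall outside the stated ranges.

    Components with no stated range (e.g., PEG or detergents) are ignored.
    """
    problems = []
    for name, value in recipe.items():
        if name in STATED_RANGES:
            low, high = STATED_RANGES[name]
            if not (low <= value <= high):
                problems.append(name)
    return sorted(problems)


# Made-up recipe: urea at 3 M exceeds the exemplary 1-2 M range.
example = {"pH": 8.0, "L-arginine_M": 0.5, "urea_M": 3.0}
print(out_of_range(example))
```

Such a check is merely a bookkeeping aid; the suitability of any particular buffer is determined by measuring activity of the refolded enzyme.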

[0117] A eukaryotic glycosyltransferase protein comprising a catalytic domain is expressed
in bacterial inclusion bodies and then refolded using the above methods. Eukaryotic
glycosyltransferases that comprise all or a portion of a stem region and a catalytic domain can
also be used in the methods described herein.

[0118] Eukaryotic glycosyltransferases, including truncated eukaryotic
glycosyltransferases, can be fused to purification tags and expressed in bacterial inclusion
bodies and then refolded using the above methods. Purification tags include, e.g., a maltose
binding protein (MBP) tag, a polyhistidine tag, a glutathione S transferase (GST), a starch
binding protein (SBP), a FLAG epitope, and a myc epitope. Refolded glycosyltransferases
can be further purified using a binding partner that binds to the purification tag. In a
preferred embodiment, an MBP tag is fused to the eukaryotic glycosyltransferase to enhance
refolding. MBP tags can be fused to, e.g., GnTI, GalTI, ST3GalIII, and GalNAcT2 proteins to
enhance refolding and recovery of the active glycosyltransferases.

[0119] Those of skill will recognize that a protein has been refolded correctly when the
refolded protein has detectable biological activity. For a glycosyltransferase, biological
activity is the ability to catalyze transfer of a donor substrate to an acceptor substrate, e.g., a
refolded ST3GalIII is able to transfer sialic acid to an acceptor substrate. Biological activity
includes, e.g., specific activities of at least 1, 2, 5, 7, or 10 units of activity. A unit is defined
as follows: one activity unit catalyzes the formation of 1 µmol of product per minute at a given
temperature (e.g., at 37 °C) and pH value (e.g., at pH 7.5). Thus, 10 units of an enzyme is a
catalytic amount of that enzyme where 10 µmol of substrate are converted to 10 µmol of
product in one minute at a temperature of, e.g., 37 °C and a pH value of, e.g., 7.5.
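
The unit definition above reduces to simple arithmetic, sketched here in Python. The function names are hypothetical. Normalizing units to milligrams of protein to obtain a specific activity is a common convention that is assumed here; the text above does not state the normalization.

```python
def activity_units(product_umol, minutes):
    """Units of activity: micromoles of product formed per minute."""
    return product_umol / minutes


def specific_activity(units, protein_mg):
    """Assumed convention: units of activity per milligram of protein."""
    return units / protein_mg


# 10 umol of product formed in one minute corresponds to 10 units,
# matching the worked statement in the text.
print(activity_units(10, 1))

# Made-up assay: 50 umol of product in 10 minutes from 2 mg of protein.
units = activity_units(50, 10)
print(units, specific_activity(units, 2))
```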

[0120] In one embodiment, rat liver N-acetyllactosaminide α-2,3-sialyltransferase
(ST3GalIII) is expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer
comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.



[0121] In another embodiment, human GalNAcT2 is expressed in bacterial inclusion 
bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or 
cystamine/cysteine. 

B. Site directed mutagenesis of glycosyltransferases to enhance refolding 
[0122] As refolding occurs, cysteine residues in a denatured protein form disulfide bonds
that help to reproduce the structure of the active protein. Incorrect pairing of cysteine
residues can lead to protein misfolding. Proteins with unpaired cysteine residues are
susceptible to misfolding because a normally unpaired cysteine can form a disulfide bond
with a normally paired cysteine, making correct cysteine pairing and protein refolding
impossible. Thus, one method to enhance refolding of a particular glycosyltransferase is to
identify unpaired cysteine residues and remove them.

[0123] Unpaired cysteine residues can be identified by determining the structure of the
glycosyltransferase of interest. Protein structure can be determined based on actual data for
the glycosyltransferase of interest, e.g., circular dichroism, NMR, and X-ray crystallography.
Protein structure can also be determined using computer modeling. Computer modeling is a
technique that can be used to model related structures based on known three-dimensional
structures of homologous molecules. Standard software is commercially available. (See e.g.,
www.accelrys.com for the multitude of software available to do computer modeling.) Once
an unpaired cysteine residue is identified, the DNA encoding the glycosyltransferase of
interest can be mutated using standard molecular biology techniques to remove the unpaired
cysteine, by deletion or by substitution with another amino acid residue. Computer modeling
is used again to select an amino acid of appropriate size, shape, and charge for substitution.
Unpaired cysteines can also be determined by peptide mapping. Once the glycosyltransferase
of interest is mutated, the protein is expressed in bacterial inclusion bodies and refolding
ability is determined. A correctly refolded glycosyltransferase will have biological activity.
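
As a minimal sketch of how unpaired cysteines might be flagged once a structure or model is available, the following Python example treats a cysteine as paired when the sulfur (SG) atom of another cysteine lies within a distance cutoff. The 2.5 angstrom cutoff is an assumption chosen to sit somewhat above a typical disulfide sulfur-sulfur bond length of about 2.05 angstroms, and the coordinates and residue numbers are made up for illustration; this is not the modeling procedure used in the Examples.

```python
import math


def unpaired_cysteines(sg_coords, cutoff=2.5):
    """Return residue numbers of cysteines with no partner SG atom within `cutoff`.

    sg_coords maps residue number -> (x, y, z) of the cysteine SG atom, in angstroms.
    """
    unpaired = []
    for res, pos in sg_coords.items():
        has_partner = any(
            other != res and math.dist(pos, other_pos) <= cutoff
            for other, other_pos in sg_coords.items()
        )
        if not has_partner:
            unpaired.append(res)
    return sorted(unpaired)


# Hypothetical coordinates: residues 10 and 55 are 2.0 angstroms apart (paired),
# while residue 121 has no nearby partner.
toy_model = {10: (0.0, 0.0, 0.0), 55: (2.0, 0.0, 0.0), 121: (15.0, 3.0, 8.0)}
print(unpaired_cysteines(toy_model))
```

A purely geometric test of this kind depends on the quality of the structure or model, so candidates would ordinarily be confirmed, e.g., by peptide mapping or by testing refolding of the mutant.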

[0124] Human N-acetylglucosaminyltransferase I (GnTI, accession number NP_002397) is
an example of a glycosyltransferase that exhibited enhanced refolding after mutagenesis of an
unpaired cysteine. (See, e.g., Example 2, below.) Human GnTI is closely related to a
number of eukaryotic GnTI proteins, e.g., Chinese hamster, accession number AAK61868;
rabbit, accession number AAA31493; rat, accession number NP_110488; golden hamster,
accession number AAD04130; mouse, accession number P27808; zebrafish, accession
number AAH58297; Xenopus, accession number CAC51119; Drosophila, accession number
NP_525117; Anopheles, accession number XP_315359; C. elegans, accession number
NP_497719; Physcomitrella patens, accession number CAD22107; Solanum tuberosum,
accession number CAC80697; Nicotiana tabacum, accession number CAC80702; Oryza
sativa, accession number CAD30022; Nicotiana benthamiana, accession number CAC82507;
and Arabidopsis thaliana, accession number NP_195537.

[0125] The structure of the rabbit N-acetylglucosaminyltransferase I (GnTI) protein had
been determined and showed that CYS123 was unpaired. (Amino acid residue numbers refer
to the full length protein sequence even when a GnTI protein has been truncated.) Computer
modeling based on the rabbit GnTI was used to determine the structure of the human GnTI
protein. An alignment is shown in Figure 6. In the human GnTI protein, CYS121 was
unpaired. Substitutions for CYS121 were made in human GnTI. A CYS121SER mutant and
a CYS121ALA mutant were active. In contrast, a CYS121THR mutant had no detectable
activity and a CYS121ASP mutant had low activity. A double mutant, ARG120ALA,
CYS121HIS, was constructed based on the predicted structure of the C. elegans GnTI
protein, and had activity.
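
The mutant names used above combine the original residue, its position in the full length sequence, and the replacement residue. The following Python sketch, offered only as an illustration, parses that notation and applies it to a one-letter amino acid sequence. The five-residue toy sequence and the function name are hypothetical; the sketch does not use the actual GnTI sequence.

```python
import re

# Three-letter to one-letter codes for the residues named in the text.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASP": "D", "CYS": "C",
    "HIS": "H", "SER": "S", "THR": "T",
}


def apply_mutation(sequence, mutation):
    """Apply a substitution such as 'CYS121SER' (1-based position) to `sequence`."""
    match = re.fullmatch(r"([A-Z]{3})(\d+)([A-Z]{3})", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation name: {mutation}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != THREE_TO_ONE[old]:
        raise ValueError(f"residue {position} is {sequence[index]}, not {old}")
    return sequence[:index] + THREE_TO_ONE[new] + sequence[index + 1:]


toy = "MKRCL"  # hypothetical sequence with ARG at position 3 and CYS at position 4
print(apply_mutation(toy, "CYS4SER"))
# A double mutant is made by applying two substitutions in turn.
print(apply_mutation(apply_mutation(toy, "ARG3ALA"), "CYS4HIS"))
```

Checking that the named original residue is actually present at the stated position guards against numbering mismatches between truncated and full length sequences.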

[0126] The amino acid sequences of the eukaryotic GnTI proteins listed above can be used
to determine protein structure based on computer modeling and the conserved function of
CYS123 from rabbit and CYS121 from human. Based on that analysis, residue 123 is an
unpaired cysteine in the following proteins: Chinese hamster GnTI, the rabbit GnTI, the rat
GnTI, the golden hamster GnTI, and the mouse GnTI. Thus, CYS123 can be mutated in each
of the GnTI enzymes to serine, alanine, or arginine to produce an active protein with
enhanced refolding activity. The following double mutant in the above proteins,
ARG122ALA, CYS123HIS, will also exhibit enhanced refolding.

C. One pot refolding of glycosyltransferases 

[0127] These embodiments of the invention are based on the surprising observation that
multiple eukaryotic glycosyltransferases expressed in bacterial inclusion bodies can be
refolded in a single vessel, i.e., a one pot method. Using this method, at least two
glycosyltransferases can be refolded together, resulting in savings of time and materials.
Refolding conditions are described above. The refolding conditions are optimized for the
mixture of glycosyltransferases; thus, conditions may not be optimal for any particular
enzyme in the mixture. However, because refolding is optimized for the combination of
glycosyltransferases, each of the refolded glycosyltransferases in the end product has
detectable biological activity. Biological activity refers to enzymatic activity of the refolded
enzymes and can be expressed as specific activity. Biological activity includes, e.g., specific
activities of at least 0.1, 0.5, 1, 2, 5, 7, or 10 units of activity. A unit is defined as follows: one
activity unit catalyzes the formation of 1 µmol of product per minute at a given temperature
(e.g., at 37 °C) and pH value (e.g., at pH 7.5). Thus, 10 units of an enzyme is a catalytic
amount of that enzyme where 10 µmol of substrate are converted to 10 µmol of product in
one minute at a temperature of, e.g., 37 °C and a pH value of, e.g., 7.5. The reaction mixture
comprising refolded glycosyltransferases can then be used, e.g., to synthesize
oligosaccharides, to synthesize glycolipids, to remodel glycoproteins, and to glycoPEGylate
glycoproteins.

[0128] In some embodiments, the glycosyltransferases can be solubilized individually from 
inclusion bodies and then combined under conditions appropriate for refolding. In other 
embodiments, inclusion bodies containing glycosyltransferases are combined, solubilized, 
and then refolded under appropriate conditions. 

[0129] Refolding buffers typically include a redox couple. Refolding can be performed at
pH values ranging from, for example, 6.0 to 10.0. Refolding buffers can include other additives
to enhance refolding, e.g., L-arginine (0.4-1 M); PEG; low concentrations of denaturants, such
as urea (1-2 M) and guanidinium chloride (0.5-1.5 M); and detergents (e.g., Chaps, SDS,
CTAB, and Triton X-100).

[0130] In some embodiments, refolding is performed in a stationary vessel, i.e., without
mixing, stirring, shaking or otherwise moving the reaction mixture.

[0131] The combination of refolded enzymes can include enzymes to construct a particular
oligosaccharide structure. Those of skill will be able to identify appropriate
glycosyltransferases for inclusion in the mixture once a desired end product is identified.

[0132] The reaction mixtures of refolded enzymes can include glycosyltransferases that
have been mutated to enhance refolding, e.g., the GnTI enzymes described above.

[0133] In a preferred embodiment, enzymes that perform N-linked glycosylation steps are
refolded together in a single vessel. For example, N-acetylglucosaminyltransferase I (GnTI),
β-1,4 galactosyltransferase I (GalTI), and N-acetyllactosaminide α-2,3-sialyltransferase
(ST3GalIII) can be expressed in bacterial inclusion bodies, solubilized, and refolded together
in a single vessel. The end product exhibited activity of all three proteins, indicating they
were all correctly refolded. Refolding also occurred when GnTI and GalTI were refolded
together without ST3GalIII. The experiments are described in detail in Example 3.



[0134] In another preferred embodiment, O-linked glycosylation of a peptide or protein is
accomplished using the bacterially expressed and refolded glycosyltransferases of this
disclosure. For example, a refolded MBP-GalNAcT2(Δ51) enzyme can be used to add
GalNAc to polypeptides. For instance, Example 4 demonstrates that refolded
MBP-GalNAcT2(Δ51) can be used to add GalNAc to the GCSF protein.

III. Glycosyltransferases 

[0135] The glycosyltransferases of use in practicing the present invention are eukaryotic
glycosyltransferases. Examples of such glycosyltransferases include those described in
Staudacher, E. (1996) Trends in Glycoscience and Glycotechnology, 8: 391-408,
afmb.cnrs-mrs.fr/~pedro/CAZY/gtf.html and www.vei.co.uk/TGN/gt_guide.htm, but are not
limited thereto.

Eukaryotic glycosyltransferases 

[0136] Some eukaryotic glycosyltransferases have topological domains at their amino
terminus that are not required for catalytic activity (see US Patent No. 5,032,519). Of the
glycosyltransferases characterized to date, the "cytoplasmic domain" is most commonly
between about 1 and about 10 amino acids in length, and is the most amino-terminal domain;
the adjacent domain, termed the "signal-anchor domain," is generally between about 10 and
about 26 amino acids in length; adjacent to the signal-anchor domain is a "stem region," which
is generally between about 20 and about 60 amino acids in length, and known to function as a
retention signal to maintain the glycosyltransferase in the Golgi apparatus; and at the
carboxyl side of the stem region is the catalytic domain.

[0137] Many mammalian glycosyltransferases have been cloned and expressed, and the
recombinant proteins have been characterized in terms of donor and acceptor substrate
specificity. They have also been investigated through site directed mutagenesis in attempts
to define residues or domains involved in either donor or acceptor substrate specificity (Aoki
et al. (1990) EMBO J. 9: 3171-3178; Harduin-Lepers et al. (1995) Glycobiology 5(8): 741-758;
Natsuka and Lowe (1994) Current Opinion in Structural Biology 4: 683-691; Zu et al.
(1995) Biochem. Biophys. Res. Comm. 206(1): 362-369; Seto et al. (1995) Eur. J. Biochem.
234: 323-328; Seto et al. (1997) J. Biol. Chem. 272: 14133-14138).

[0138] In one group of embodiments, a functional domain of the recombinant
glycosyltransferase proteins of the present invention is obtained from a known
sialyltransferase. Examples of sialyltransferases that are suitable for use in the present
invention include, but are not limited to, ST3GalIII, ST3Gal IV, ST3Gal I, ST6Gal I, ST3Gal
V, ST6Gal II, ST6GalNAc I, ST6GalNAc II, and ST6GalNAc III (the sialyltransferase
nomenclature used herein is as described in Tsuji et al. (1996) Glycobiology 6: v-xiv). An
exemplary α2,3-sialyltransferase (EC 2.4.99.6) transfers sialic acid to the non-reducing
terminal Gal of a Galβ1→4GlcNAc disaccharide or glycoside. See, Van den Eijnden et al., J.
Biol. Chem., 256:3159 (1981), Weinstein et al., J. Biol. Chem., 257:13845 (1982) and Wen et
al., J. Biol. Chem., 267:21011 (1992). Another exemplary α2,3-sialyltransferase (EC
2.4.99.4) transfers sialic acid to the non-reducing terminal Gal of a Galβ1→3GalNAc
disaccharide or glycoside. See, Rearick et al., J. Biol. Chem., 254: 4444 (1979) and Gillespie
et al., J. Biol. Chem., 267:21004 (1992). Further exemplary enzymes include Gal-β-1,4-GlcNAc
α-2,6 sialyltransferase (See, Kurosawa et al. Eur. J. Biochem. 219: 375-381 (1994)).
Sialyltransferase nomenclature is described in Tsuji, S. et al. (1996) Glycobiology 6:v-vii.

[0139] An example of a sialyltransferase that is useful in the claimed methods is ST3GalIII,
which is also referred to as α(2,3)sialyltransferase (EC 2.4.99.6). This enzyme catalyzes the
transfer of sialic acid to the Gal of a Galβ1,3GlcNAc, Galβ1,3GalNAc or Galβ1,4GlcNAc
glycoside (see, e.g., Wen et al. (1992) J. Biol. Chem. 267: 21011; Van den Eijnden et al.
(1981) J. Biol. Chem. 256: 3159). The sialic acid is linked to a Gal with the formation of an
α-linkage between the two saccharides. Bonding (linkage) between the saccharides is
between the 2-position of NeuAc and the 3-position of Gal. This particular enzyme can be
isolated from rat liver (Weinstein et al. (1982) J. Biol. Chem. 257: 13845); the human cDNA
(Sasaki et al. (1993) J. Biol. Chem. 268: 22782-22787; Kitagawa & Paulson (1994) J. Biol.
Chem. 269: 1394-1401) and genomic (Kitagawa et al. (1996) J. Biol. Chem. 271: 931-938)
DNA sequences are known, facilitating production of this enzyme by recombinant
expression. In a preferred embodiment, the claimed sialylation methods use a rat ST3GalIII.
Rat ST3GalIII has been cloned and the sequence is known. See, e.g., Wen et al., J. Biol.
Chem. 267:21011-21019 (1992) and Accession number M97754.

[0140] In another group of embodiments, a functional domain of the recombinant
glycosyltransferase proteins of the present inventions is obtained from a fucosyltransferase.
A number of fucosyltransferases are known to those of skill in the art. Briefly,
fucosyltransferases include any of those enzymes which transfer L-fucose from GDP-fucose
to a hydroxy position of an acceptor sugar. In some embodiments, for example, the acceptor
sugar is a GlcNAc in a Galβ(1→4)GlcNAc group in an oligosaccharide glycoside. Suitable
fucosyltransferases for this reaction include the known Galβ(1→3,4)GlcNAc
α(1→3,4)fucosyltransferase (FTIII, E.C. No. 2.4.1.65) which is obtained from human milk
(see, Palcic, et al., Carbohydrate Res. 190:1-11 (1989); Prieels, et al., J. Biol. Chem. 256:
10456-10463 (1981); and Nunez, et al., Can. J. Chem. 59: 2086-2095 (1981)) and the
Galβ(1→4)GlcNAc α(1→3)fucosyltransferases (FTIV, FTV, and FTVI, E.C. No. 2.4.1.65)
and NeuAcα(2,3)βGal(1→4)βGlcNAc α(1→3)fucosyltransferases (FTVII) which are found
in human serum. Also available is the α1,3 fucosyltransferase IX (nucleotide sequences of
human and mouse FTIX) as described in Kaneko et al. (1999) FEBS Lett. 452: 237-242. In
addition, a recombinant form of Galβ(1→3,4)GlcNAc α(1→3,4)fucosyltransferase is
available (see, Dumas, et al., Bioorg. Med. Letters 1:425-428 (1991) and Kukowska-Latallo,
et al., Genes and Development 4:1288-1303 (1990)). Other exemplary fucosyltransferases
include α1,2 fucosyltransferase (E.C. No. 2.4.1.69). Enzymatic fucosylation can be carried
out by the methods described in Mollicone, et al., Eur. J. Biochem. 191:169-176 (1990) or
U.S. Patent No. 5,374,655.

[0141] In another group of embodiments, a functional domain of the recombinant
glycosyltransferase proteins of the present inventions is obtained from known
galactosyltransferases. Exemplary galactosyltransferases include β-1,4 galactosyltransferase
I, α1,3-galactosyltransferases (E.C. No. 2.4.1.151, see, e.g., Dabkowski et al., Transplant
Proc. 25:2921 (1993) and Yamamoto et al. Nature 345:229-233 (1990), bovine (GenBank
j04989, Joziasse et al. (1989) J. Biol. Chem. 264:14290-14297), murine (GenBank m26925;
Larsen et al. (1989) Proc. Nat'l. Acad. Sci. USA 86:8227-8231), porcine (GenBank L36152;
Strahan et al. (1995) Immunogenetics 41:101-105)). Another suitable α1,3-galactosyltransferase
is that which is involved in synthesis of the blood group B antigen (EC
2.4.1.37, Yamamoto et al. (1990) J. Biol. Chem. 265:1146-1151 (human)). Also suitable for
use in the fusion proteins of the invention are β1,4-galactosyltransferases, which include, for
example, EC 2.4.1.90 (LacNAc synthetase) and EC 2.4.1.22 (lactose synthetase) (bovine
(D'Agostaro et al. (1989) Eur. J. Biochem. 183:211-217), human (Masri et al. (1988)
Biochem. Biophys. Res. Commun. 157:657-663), murine (Nakazawa et al. (1988) J. Biochem.
104:165-168)), as well as E.C. 2.4.1.38 and the ceramide galactosyltransferase (EC 2.4.1.45,
Stahl et al. (1994) J. Neurosci. Res. 38:234-242). Other suitable galactosyltransferases
include, for example, α1,2-galactosyltransferases (from e.g., Schizosaccharomyces pombe,
Chapell et al. (1994) Mol. Biol. Cell 5:519-528).

[0142] Other glycosyltransferases that are useful in the recombinant fusion proteins of the
present invention have been described in detail, as for the sialyltransferases,
galactosyltransferases, and fucosyltransferases. In particular, the glycosyltransferase can also
be, for instance, a glucosyltransferase, e.g., Alg8 (Stagljov et al., Proc. Natl. Acad. Sci. USA
91:5977 (1994)) or Alg5 (Heesen et al. Eur. J. Biochem. 224:71 (1994)),
N-acetylgalactosaminyltransferases such as, for example, β(1,3)-N-acetylgalactosaminyltransferase,
β(1,4)-N-acetylgalactosaminyltransferases (US Patent No.
5,691,180, Nagata et al. J. Biol. Chem. 267:12082-12089 (1992), and Smith et al. J. Biol.
Chem. 269:15162 (1994)) and protein N-acetylgalactosaminyltransferase (Homa et al. J. Biol.
Chem. 268:12609 (1993)). Suitable N-acetylglucosaminyltransferases include GnTI
(2.4.1.101, Hull et al., BBRC 176:608 (1991)), GnTII, and GnTIII (Ihara et al. J. Biochem.
113:692 (1993)), GnTV (Shoreiban et al. J. Biol. Chem. 268: 15381 (1993)), O-linked
N-acetylglucosaminyltransferase (Bierhuizen et al. Proc. Natl. Acad. Sci. USA 89:9326 (1992)),
N-acetylglucosamine-1-phosphate transferase (Rajput et al. Biochem. J. 285:985 (1992)), and
hyaluronan synthase. Also of interest are enzymes involved in proteoglycan synthesis, such
as, for example, N-acetylgalactosaminyltransferase I (EC 2.4.1.174), and enzymes involved
in chondroitin sulfate synthesis, such as N-acetylgalactosaminyltransferase II (EC 2.4.1.175).
Suitable mannosyltransferases include α(1,2) mannosyltransferase, α(1,3)
mannosyltransferase, β(1,4) mannosyltransferase, Dol-P-Man synthase, OCh1, and Pmt1.
Xylosyltransferases include, for example, protein xylosyltransferase (EC 2.4.2.26).

[0143] In some embodiments, eukaryotic N-acetylgalactosaminyltransferases are expressed
in bacteria and refolded using the methods of this disclosure. A number of GalNAcT
enzymes have been isolated and characterized, e.g., GalNAcT1, accession number X85018;
GalNAcT2, accession number X85019 (both described in White et al., J. Biol. Chem.
270:24156-24165 (1995)); and GalNAcT3, accession number X92689 (described in Bennett
et al., J. Biol. Chem. 271:17006-17012 (1996)).

IV. Nucleic acids

[0144] Nucleic acids that encode glycosyltransferases, and methods of obtaining such
nucleic acids, are known to those of skill in the art. Suitable nucleic acids (e.g., cDNA,
genomic, or subsequences (probes)) can be cloned, or amplified by in vitro methods such as
the polymerase chain reaction (PCR), the ligase chain reaction (LCR), the transcription-based
amplification system (TAS), or the self-sustained sequence replication system (SSR). A wide
variety of cloning and in vitro amplification methodologies are well-known to persons of
skill. Examples of these techniques and instructions sufficient to direct persons of skill
through many cloning exercises are found in Berger and Kimmel, Guide to Molecular
Cloning Techniques, Methods in Enzymology 152 Academic Press, Inc., San Diego, CA
(Berger); Sambrook et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed.) Vol.
1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY, (Sambrook et al.);
Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current Protocols, a joint
venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1994
Supplement) (Ausubel); Cashion et al., U.S. patent number 5,017,478; and Carr, European
Patent No. 0,246,864.

[0145] A DNA that encodes a glycosyltransferase, or a subsequence thereof, can be
prepared by any suitable method described above, including, for example, cloning and
restriction of appropriate sequences with restriction enzymes. In one preferred embodiment,
nucleic acids encoding glycosyltransferases are isolated by routine cloning methods. A
nucleotide sequence of a glycosyltransferase as provided in, for example, GenBank or other
sequence database (see above) can be used to provide probes that specifically hybridize to a
glycosyltransferase gene in a genomic DNA sample, or to an mRNA encoding a
glycosyltransferase in a total RNA sample (e.g., in a Southern or Northern blot). Once the
target nucleic acid encoding a glycosyltransferase is identified, it can be isolated according to
standard methods known to those of skill in the art (see, e.g., Sambrook et al. (1989)
Molecular Cloning: A Laboratory Manual, 2nd Ed., Vols. 1-3, Cold Spring Harbor
Laboratory; Berger and Kimmel (1987) Methods in Enzymology, Vol. 152: Guide to
Molecular Cloning Techniques, San Diego: Academic Press, Inc.; or Ausubel et al. (1987)
Current Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience, New
York). Further, the isolated nucleic acids can be cleaved with restriction enzymes to create
nucleic acids encoding the full-length glycosyltransferase, or subsequences thereof, e.g.,
containing subsequences encoding at least a subsequence of a stem region or catalytic domain
of a glycosyltransferase. These restriction enzyme fragments, encoding a glycosyltransferase
or subsequences thereof, may then be ligated, for example, to produce a nucleic acid
encoding a recombinant glycosyltransferase fusion protein.

[0146] A nucleic acid encoding a glycosyltransferase, or a subsequence thereof, can be
characterized by assaying for the expressed product. Assays based on the detection of the
physical, chemical, or immunological properties of the expressed protein can be used. For
example, one can identify a cloned glycosyltransferase, including a glycosyltransferase fusion
protein, by the ability of a protein encoded by the nucleic acid to catalyze the transfer of a
saccharide from a donor substrate to an acceptor substrate. In a preferred method, capillary
electrophoresis is employed to detect the reaction products. This highly sensitive assay
involves using either saccharide or disaccharide aminophenyl derivatives which are labeled
with fluorescein as described in Wakarchuk et al. (1996) J. Biol. Chem. 271 (45): 28271-276.
For example, to assay for a Neisseria lgtC enzyme, either FCHASE-AP-Lac or FCHASE-AP-Gal
can be used, whereas for the Neisseria lgtB enzyme an appropriate reagent is
FCHASE-AP-GlcNAc (Id.).

[0147] Also, a nucleic acid encoding a glycosyltransferase, or a subsequence thereof, can
be chemically synthesized. Suitable methods include the phosphotriester method of Narang
et al. (1979) Meth. Enzymol. 68: 90-99; the phosphodiester method of Brown et al. (1979)
Meth. Enzymol. 68: 109-151; the diethylphosphoramidite method of Beaucage et al. (1981)
Tetra. Lett., 22: 1859-1862; and the solid support method of U.S. Patent No. 4,458,066.
Chemical synthesis produces a single stranded oligonucleotide. This can be converted into
double stranded DNA by hybridization with a complementary sequence, or by polymerization
with a DNA polymerase using the single strand as a template. One of skill recognizes that
while chemical synthesis of DNA is often limited to sequences of about 100 bases, longer
sequences may be obtained by the ligation of shorter sequences.
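
For illustration, the complementary sequence needed to convert a synthesized single strand into double stranded DNA, and a rough count of the synthetic pieces needed for a longer sequence, can be computed as in the Python sketch below. The function names are hypothetical, the oligonucleotide is made up, and the fragment count ignores any overlaps or overhangs that a practical ligation design would require.

```python
import math

# Watson-Crick base pairing for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(oligo):
    """Return the strand that hybridizes to `oligo`, written 5' to 3'."""
    return oligo.upper().translate(COMPLEMENT)[::-1]


def fragments_needed(total_bases, max_synthesis_length=100):
    """Minimum number of synthetic pieces, ignoring overlap needed for ligation."""
    return math.ceil(total_bases / max_synthesis_length)


print(reverse_complement("ATGCCG"))  # made-up six-base oligonucleotide
print(fragments_needed(350))         # pieces of at most about 100 bases each
```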

[0148] Nucleic acids encoding glycosyltransferases, or subsequences thereof, can be cloned
using DNA amplification methods such as polymerase chain reaction (PCR). Thus, for
example, the nucleic acid sequence or subsequence is PCR amplified, using a sense primer
containing one restriction enzyme site (e.g., NdeI) and an antisense primer containing another
restriction enzyme site (e.g., HindIII). This will produce a nucleic acid encoding the desired
glycosyltransferase or subsequence and having terminal restriction enzyme sites. This
nucleic acid can then be easily ligated into a vector containing a nucleic acid encoding the
second molecule and having the appropriate corresponding restriction enzyme sites. Suitable
PCR primers can be determined by one of skill in the art using the sequence information
provided in GenBank or other sources. Appropriate restriction enzyme sites can also be
added to the nucleic acid encoding the glycosyltransferase protein or protein subsequence by
site-directed mutagenesis. The plasmid containing the glycosyltransferase-encoding
nucleotide sequence or subsequence is cleaved with the appropriate restriction endonuclease
and then ligated into an appropriate vector for amplification and/or expression according to
standard methods. Examples of techniques sufficient to direct persons of skill through in
vitro amplification methods are found in Berger, Sambrook, and Ausubel, as well as Mullis et
al. (1987) U.S. Patent No. 4,683,202; PCR Protocols A Guide to Methods and Applications
(Innis et al., eds) Academic Press Inc. San Diego, CA (1990) (Innis); Arnheim & Levinson
(October 1, 1990) C&EN 36-47; The Journal Of NIH Research (1991) 3: 81-94; Kwoh et al.
(1989) Proc. Natl. Acad. Sci. USA 86: 1173; Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA
87: 1874; Lomell et al. (1989) J. Clin. Chem. 35: 1826; Landegren et al. (1988) Science
241: 1077-1080; Van Brunt (1990) Biotechnology 8: 291-294; Wu and Wallace (1989) Gene
4: 560; and Barringer et al. (1990) Gene 89: 117.
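
The primer arrangement described in the preceding paragraph can be sketched in Python as follows. The recognition sequences for NdeI (CATATG) and HindIII (AAGCTT) are the standard ones; the template, the annealing length, and the function names are hypothetical and chosen only to keep the example short. Practical primers are typically longer and often carry a few extra 5' bases to aid digestion, which this sketch omits.

```python
NDEI_SITE = "CATATG"     # NdeI recognition sequence
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(template, anneal_len=18):
    """Return (sense, antisense) primers with 5' restriction sites appended.

    The sense primer matches the start of the template; the antisense primer
    is the reverse complement of the end of the template.
    """
    if anneal_len > len(template):
        raise ValueError("annealing length exceeds template length")
    sense = NDEI_SITE + template[:anneal_len].upper()
    antisense = HINDIII_SITE + reverse_complement(template[-anneal_len:])
    return sense, antisense


# Made-up 21-base template and a deliberately short annealing region.
sense, antisense = design_primers("ATGGCTAAACCCGGGTTTTGA", anneal_len=6)
print(sense)
print(antisense)
```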

[0149] Other physical properties of a cloned glycosyltransferase protein, including
glycosyltransferase fusion protein, expressed from a particular nucleic acid, can be compared
to properties of known glycosyltransferases to provide another method of identifying suitable
sequences or domains of the glycosyltransferase that are determinants of acceptor substrate
specificity and/or catalytic activity. Alternatively, a putative glycosyltransferase gene or
recombinant glycosyltransferase gene can be mutated, and its role as glycosyltransferase, its
ability to be refolded, or the role of particular sequences or domains established by detecting
a variation in the structure of a carbohydrate normally produced by the unmutated,
naturally-occurring, or control glycosyltransferase.

[0150] Functional domains of cloned glycosyltransferases can be identified by using
standard methods for mutating or modifying the glycosyltransferases and testing the modified
or mutated proteins for activities such as acceptor substrate activity and/or catalytic activity,
as described herein. The functional domains of the various glycosyltransferases can be used
to construct nucleic acids encoding recombinant glycosyltransferase fusion proteins
comprising the functional domains of one or more glycosyltransferases. These fusion
proteins can then be tested for the desired acceptor substrate or catalytic activity.

[0151] In an exemplary approach to cloning recombinant glycosyltransferase fusion
proteins, the known nucleic acid or amino acid sequences of cloned glycosyltransferases are
aligned and compared to determine the amount of sequence identity between various
glycosyltransferases. This information can be used to identify and select protein domains that
confer or modulate glycosyltransferase activities, e.g., acceptor substrate activity and/or
catalytic activity, based on the amount of sequence identity between the glycosyltransferases
of interest. For example, domains having sequence identity between the glycosyltransferases
of interest, and that are associated with a known activity, can be used to construct
recombinant glycosyltransferase fusion proteins containing that domain, and having the
activity associated with that domain (e.g., acceptor substrate specificity and/or catalytic
activity).
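
The comparison step in the preceding paragraph can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some external tool and simply counts identical residues over the ungapped columns. The function name, the gap character, and the example fragments are hypothetical and are not taken from any actual glycosyltransferase.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are not counted (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical, pre-aligned fragments used only for illustration.
domain_x = "MKT-LLVAG"
domain_y = "MKTALLIAG"
print(percent_identity(domain_x, domain_y))
```

Other conventions for the denominator (for example, the full alignment length or the shorter sequence length) give different values, so the convention used should be stated whenever identities are compared.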

V. Expression of recombinant glycosyltransferases

[0152] Recombinant eukaryotic glycosyltransferases can be expressed in a variety of host
cells, including E. coli, other bacterial hosts, yeast, and various higher eukaryotic cells such
as the COS, CHO and HeLa cell lines and myeloma cell lines. The host cells can be
mammalian cells, plant cells, or microorganisms, such as, for example, yeast cells, bacterial
cells, or filamentous fungal cells. Examples of suitable host cells include, for example,
Azotobacter sp. (e.g., A. vinelandii), Pseudomonas sp., Rhizobium sp., Erwinia sp.,
Escherichia sp. (e.g., E. coli), Bacillus, Pseudomonas, Proteus, Salmonella, Serratia,
Shigella, Rhizobia, Vitreoscilla, Paracoccus and Klebsiella sp., among many others. The
cells can be of any of several genera, including Saccharomyces (e.g., S. cerevisiae), Candida
(e.g., C. utilis, C. parapsilosis, C. krusei, C. versatilis, C. lipolytica, C. zeylanoides, C.
guilliermondii, C. albicans, and C. humicola), Pichia (e.g., P. farinosa and P. ohmeri),
Torulopsis (e.g., T. candida, T. sphaerica, T. xylinus, T. famata, and T. versatilis),
Debaryomyces (e.g., D. subglobosus, D. cantarellii, D. globosus, D. hansenii, and D.
japonicus), Zygosaccharomyces (e.g., Z. rouxii and Z. bailii), Kluyveromyces (e.g., K.
marxianus), Hansenula (e.g., H. anomala and H. jadinii), and Brettanomyces (e.g., B.
lambicus and B. anomalus). Examples of useful bacteria include, but are not limited to,
Escherichia, Enterobacter, Azotobacter, Erwinia, and Klebsiella.

[0153] Typically, the polynucleotide that encodes the fusion protein is placed under the
control of a promoter that is functional in the desired host cell. An extremely wide variety of
promoters are well known, and can be used in the expression vectors of the invention,
depending on the particular application. Ordinarily, the promoter selected depends upon the
cell in which the promoter is to be active. Other expression control sequences such as
ribosome binding sites, transcription termination sites and the like are also optionally
included. Constructs that include one or more of these control sequences are termed
"expression cassettes." Accordingly, the invention provides expression cassettes into which
the nucleic acids that encode fusion proteins are incorporated for high level expression in a
desired host cell.

[0154] Expression control sequences that are suitable for use in a particular host cell are
often obtained by cloning a gene that is expressed in that cell. Commonly used prokaryotic
control sequences, which are defined herein to include promoters for transcription initiation,
optionally with an operator, along with ribosome binding site sequences, include such
commonly used promoters as the beta-lactamase (penicillinase) and lactose (lac) promoter
systems (Chang et al., Nature (1977) 198: 1056), the tryptophan (trp) promoter system
(Goeddel et al., Nucleic Acids Res. (1980) 8: 4057), the tac promoter (DeBoer et al., Proc.
Natl. Acad. Sci. U.S.A. (1983) 80: 21-25); and the lambda-derived PL promoter and N-gene
ribosome binding site (Shimatake et al., Nature (1981) 292: 128). The particular promoter
system is not critical to the invention; any available promoter that functions in prokaryotes
can be used.

[0155] For expression of recombinant eukaryotic glycosyltransferases in prokaryotic cells
other than E. coli, a promoter that functions in the particular prokaryotic species is required.
Such promoters can be obtained from genes that have been cloned from the species, or
heterologous promoters can be used. For example, the hybrid trp-lac promoter functions in
Bacillus in addition to E. coli.

[0156] A ribosome binding site (RBS) is conveniently included in the expression cassettes
of the invention. An RBS in E. coli, for example, consists of a nucleotide sequence 3-9
nucleotides in length located 3-11 nucleotides upstream of the initiation codon (Shine and
Dalgarno, Nature (1975) 254: 34; Steitz, In Biological regulation and development: Gene
expression (ed. R.F. Goldberger), vol. 1, p. 349, 1979, Plenum Publishing, NY).
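
The spacing rule quoted above can be sketched as a simple check on a candidate expression cassette. The sketch below assumes that "3-11 nucleotides upstream" means the number of nucleotides between the 3' end of the RBS and the first base of the initiation codon; the short list of Shine-Dalgarno-like motifs and the example sequence are illustrative assumptions, not sequences from the invention.

```python
import re

# Core Shine-Dalgarno-like motifs; this short list is an illustrative assumption.
SD_MOTIFS = ("AGGAGG", "AGGAG", "GGAGG", "AGGA", "GGAG", "GAGG")


def find_rbs(mrna, start_codon_index, min_spacing=3, max_spacing=11):
    """Return (motif, spacing) for the longest motif whose 3' end lies
    min_spacing..max_spacing nucleotides upstream of the start codon, else None."""
    seq = mrna.upper().replace("U", "T")
    for motif in sorted(SD_MOTIFS, key=len, reverse=True):
        # A lookahead pattern lets finditer report overlapping occurrences.
        for match in re.finditer(f"(?={motif})", seq):
            motif_end = match.start() + len(motif)
            spacing = start_codon_index - motif_end
            if min_spacing <= spacing <= max_spacing:
                return motif, spacing
    return None


# Hypothetical leader: the ATG initiation codon begins at index 16.
leader = "TTAAGGAGGTATACATATGAAA"
print(find_rbs(leader, 16))
```

A result of None would suggest that the construct lacks a recognizable RBS in the expected window, in which case one could be added to the cassette.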

[0157] For expression of the recombinant eukaryotic glycosyltransferases in yeast,
convenient promoters include GAL1-10 (Johnson and Davies (1984) Mol. Cell. Biol. 4:1440-
1448), ADH2 (Russell et al. (1983) J. Biol. Chem. 258:2674-2682), PHO5 (EMBO J. (1982)
6:675-680), and MFα (Herskowitz and Oshima (1982) in The Molecular Biology of the Yeast
Saccharomyces (eds. Strathern, Jones, and Broach) Cold Spring Harbor Lab., Cold Spring
Harbor, N.Y., pp. 181-209). Another suitable promoter for use in yeast is the ADH2/GAPDH
hybrid promoter as described in Cousens et al., Gene 61:265-275 (1987). For filamentous
fungi such as, for example, strains of the fungi Aspergillus (McKnight et al., U.S. Patent No.
4,935,349), examples of useful promoters include those derived from Aspergillus nidulans
glycolytic genes, such as the ADH3 promoter (McKnight et al., EMBO J. 4: 2093-2099
(1985)) and the tpiA promoter. An example of a suitable terminator is the ADH3 terminator
(McKnight et al.).

[0158] Suitable constitutive promoters for use in plants include, for example, the
cauliflower mosaic virus (CaMV) 35S transcription initiation region and region VI
promoters, the 1'- or 2'- promoter derived from T-DNA of Agrobacterium tumefaciens, and
other promoters active in plant cells that are known to those of skill in the art. Other suitable
promoters include the full-length transcript promoter from Figwort mosaic virus, actin
promoters, histone promoters, tubulin promoters, or the mannopine synthase promoter
(MAS). Other constitutive plant promoters include various ubiquitin or polyubiquitin
promoters derived from, inter alia, Arabidopsis (Sun and Callis, Plant J., 11(5): 1017-1027
(1997)), the mas, Mac or DoubleMac promoters (described in United States Patent No.
5,106,739 and by Comai et al., Plant Mol. Biol. 15:373-381 (1990)) and other transcription
initiation regions from various plant genes known to those of skill in the art. Useful
promoters for plants also include those obtained from Ti- or Ri-plasmids, from plant cells,
plant viruses or other hosts where the promoters are found to be functional in plants.
Bacterial promoters that function in plants, and thus are suitable for use in the methods of the
invention, include the octopine synthetase promoter, the nopaline synthase promoter, and the
mannopine synthetase promoter. Suitable endogenous plant promoters include the
ribulose-1,5-bisphosphate (RUBP) carboxylase small subunit (ssu) promoter, the β-conglycinin
promoter, the phaseolin promoter, the ADH promoter, and heat-shock promoters.

[0159] Either constitutive or regulated promoters can be used in the present invention.
Regulated promoters can be advantageous because the host cells can be grown to high
densities before expression of the fusion proteins is induced. High level expression of
heterologous proteins slows cell growth in some situations. An inducible promoter is a
promoter that directs expression of a gene where the level of expression is alterable by
environmental or developmental factors such as, for example, temperature, pH, anaerobic or
aerobic conditions, light, transcription factors and chemicals. Such promoters are referred to
herein as "inducible" promoters, which allow one to control the timing of expression of the
glycosyltransferase or enzyme involved in nucleotide sugar synthesis. For E. coli and other
bacterial host cells, inducible promoters are known to those of skill in the art. These include,
for example, the lac promoter, the bacteriophage lambda PL promoter, the hybrid trp-lac
promoter (Amann et al. (1983) Gene 25: 167; de Boer et al. (1983) Proc. Nat'l. Acad. Sci.
USA 80: 21), and the bacteriophage T7 promoter (Studier et al. (1986) J. Mol. Biol.; Tabor et
al. (1985) Proc. Nat'l. Acad. Sci. USA 82: 1074-8). These promoters and their use are
discussed in Sambrook et al., supra. A particularly preferred inducible promoter for
expression in prokaryotes is a dual promoter that includes a tac promoter component linked
to a promoter component obtained from a gene or genes that encode enzymes involved in
galactose metabolism (e.g., a promoter from a UDPgalactose 4-epimerase gene (galE)). The
dual tac-gal promoter, which is described in PCT Patent Application Publ. No. WO98/20111,
provides a level of expression that is greater than that provided by either promoter alone.

[0160] Inducible promoters for use in plants are known to those of skill in the art (see, e.g.,
references cited in Kuhlemeier et al. (1987) Ann. Rev. Plant Physiol. 38:221), and include
those of the 1,5-ribulose bisphosphate carboxylase small subunit genes of Arabidopsis
thaliana (the "ssu" promoter), which are light-inducible and active only in photosynthetic
tissue.

[0161] Inducible promoters for other organisms are also well known to those of skill in the
art. These include, for example, the arabinose promoter, the lacZ promoter, the
metallothionein promoter, and the heat shock promoter, as well as many others.

[0162] A construct that includes a polynucleotide of interest operably linked to gene
expression control signals that, when placed in an appropriate host cell, drive expression of
the polynucleotide is termed an "expression cassette." Expression cassettes that encode the
fusion proteins of the invention are often placed in expression vectors for introduction into
the host cell. The vectors typically include, in addition to an expression cassette, a nucleic
acid sequence that enables the vector to replicate independently in one or more selected host
cells. Generally, this sequence is one that enables the vector to replicate independently of the
host chromosomal DNA and includes origins of replication or autonomously replicating
sequences. Such sequences are well known for a variety of bacteria. For instance, the origin
of replication from the plasmid pBR322 is suitable for most Gram-negative bacteria.
Alternatively, the vector can replicate by becoming integrated into the host cell genomic
complement and being replicated as the cell undergoes DNA replication. A preferred
expression vector for expression of the enzymes in bacterial cells is pTGK, which includes
a dual tac-gal promoter and is described in PCT Patent Application Publ. No. WO98/20111.

[0163] It may also be desirable to add regulatory sequences which allow the regulation of
the expression of the polypeptide relative to the growth of the host cell. Examples of
regulatory systems are those which cause the expression of the gene to be turned on or off in
response to a chemical or physical stimulus, including the presence of a regulatory
compound. Regulatory systems in prokaryotic systems include the lac, tac, and trp operator
systems. In yeast, the ADH2 system or GAL1 system may be used. In filamentous fungi, the
TAKA α-amylase promoter, Aspergillus niger glucoamylase promoter, and Aspergillus
oryzae glucoamylase promoter may be used as regulatory sequences.

[0164] The construction of polynucleotide constructs generally requires the use of vectors
able to replicate in bacteria. A plethora of kits are commercially available for the purification
of plasmids from bacteria (see, for example, EasyPrep™ and FlexiPrep™, both from Pharmacia
Biotech; StrataClean™, from Stratagene; and QIAexpress Expression System, Qiagen). The
isolated and purified plasmids can then be further manipulated to produce other plasmids, and
used to transfect cells. Cloning in Streptomyces or Bacillus is also possible.

[0165] Selectable markers are often incorporated into the expression vectors used to
express the polynucleotides of the invention. These genes can encode a gene product, such as
a protein, necessary for the survival or growth of transformed host cells grown in a selective
culture medium. Host cells not transformed with the vector containing the selection gene will
not survive in the culture medium. Typical selection genes encode proteins that confer
resistance to antibiotics or other toxins, such as ampicillin, neomycin, kanamycin,
chloramphenicol, or tetracycline. Alternatively, selectable markers may encode proteins that
complement auxotrophic deficiencies or supply critical nutrients not available from complex
media, e.g., the gene encoding D-alanine racemase for Bacilli. Often, the vector will have
one selectable marker that is functional in, e.g., E. coli, or other cells in which the vector is
replicated prior to being introduced into the host cell. A number of selectable markers are
known to those of skill in the art and are described for instance in Sambrook et al., supra. A
preferred selectable marker for use in bacterial cells is a kanamycin resistance marker (Vieira
and Messing, Gene 19: 259 (1982)). Use of kanamycin selection is advantageous over, for
example, ampicillin selection because ampicillin is quickly degraded by β-lactamase in
culture medium, thus removing selective pressure and allowing the culture to become
overgrown with cells that do not contain the vector.

[0166] Construction of suitable vectors containing one or more of the above listed
components employs standard ligation techniques as described in the references cited above.
Isolated plasmids or DNA fragments are cleaved, tailored, and re-ligated in the form desired
to generate the plasmids required. To confirm correct sequences in plasmids constructed, the
plasmids can be analyzed by standard techniques such as by restriction endonuclease
digestion, and/or sequencing according to known methods. Molecular cloning techniques to
achieve these ends are known in the art. A wide variety of cloning and in vitro amplification
methods suitable for the construction of recombinant nucleic acids are well-known to persons
of skill. Examples of these techniques and instructions sufficient to direct persons of skill
through many cloning exercises are found in Berger and Kimmel, Guide to Molecular
Cloning Techniques, Methods in Enzymology, Volume 152, Academic Press, Inc., San Diego,
CA (Berger); and Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current
Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley &
Sons, Inc., (1998 Supplement) (Ausubel).

[0167] A variety of common vectors suitable for use as starting materials for constructing
the expression vectors of the invention are well known in the art. For cloning in bacteria,
common vectors include pBR322 derived vectors such as pBLUESCRIPT™, and λ-phage
derived vectors. In yeast, vectors include Yeast Integrating plasmids (e.g., YIp5) and Yeast
Replicating plasmids (the YRp series plasmids) and pGPD-2. Expression in mammalian cells
can be achieved using a variety of commonly available plasmids, including pSV2, pBC12BI,
and p91023, as well as lytic virus vectors (e.g., vaccinia virus, adenovirus, and baculovirus),
episomal virus vectors (e.g., bovine papillomavirus), and retroviral vectors (e.g., murine
retroviruses).

[0168] The methods for introducing the expression vectors into a chosen host cell are not
particularly critical, and such methods are known to those of skill in the art. For example, the
expression vectors can be introduced into prokaryotic cells, including E. coli, by calcium
chloride transformation, and into eukaryotic cells by calcium phosphate treatment or
electroporation. Other transformation methods are also suitable.

[0169] Translational coupling may be used to enhance expression. The strategy uses a
short upstream open reading frame derived from a highly expressed gene native to the
translational system, which is placed downstream of the promoter, and a ribosome binding
site followed after a few amino acid codons by a termination codon. Just prior to the
termination codon is a second ribosome binding site, and following the termination codon is a
start codon for the initiation of translation. The system dissolves secondary structure in the
RNA, allowing for the efficient initiation of translation. See Squires et al. (1988) J. Biol.
Chem. 263: 16297-16302.

[0170] The recombinant eukaryotic glycosyltransferases of the invention can also be
further linked to other bacterial proteins. This approach often results in high yields, because
normal prokaryotic control sequences direct transcription and translation. In E. coli, lacZ
fusions are often used to express heterologous proteins. Suitable vectors are readily
available, such as the pUR, pEX, and pMR100 series (see, e.g., Sambrook et al., supra). For
certain applications, it may be desirable to cleave the non-glycosyltransferase and/or
accessory enzyme amino acids from the fusion protein after purification. This can be
accomplished by any of several methods known in the art, including cleavage by cyanogen
bromide, a protease, or by Factor Xa (see, e.g., Sambrook et al., supra; Itakura et al., Science
(1977) 198: 1056; Goeddel et al., Proc. Natl. Acad. Sci. USA (1979) 76: 106; Nagai et al.,
Nature (1984) 309: 810; Sung et al., Proc. Natl. Acad. Sci. USA (1986) 83: 561). Cleavage
sites can be engineered into the gene for the fusion protein at the desired point of cleavage.

[0171] More than one recombinant eukaryotic glycosyltransferase may be expressed in a
single host cell by placing multiple transcriptional cassettes in a single expression vector, or
by utilizing different selectable markers for each of the expression vectors which are
employed in the cloning strategy.

[0172] A suitable system for obtaining recombinant proteins from E. coli which maintains
the integrity of their N-termini has been described by Miller et al. Biotechnology 7:698-704
(1989). In this system, the gene of interest is produced as a C-terminal fusion to the first 76
residues of the yeast ubiquitin gene containing a peptidase cleavage site. Cleavage at the
junction of the two moieties results in production of a protein having an intact authentic
N-terminal residue.

[0173] The expression vectors of the invention can be transferred into the chosen host cell
by well-known methods such as calcium chloride transformation for E. coli and calcium
phosphate treatment or electroporation for mammalian cells. Cells transformed by the
plasmids can be selected by resistance to antibiotics conferred by genes contained on the
plasmids, such as the amp, gpt, neo and hyg genes.

VI. Proteins and protein purification

[0174] The recombinant eukaryotic glycosyltransferase proteins can be purified according
to standard procedures of the art, including ammonium sulfate precipitation, affinity columns,
column chromatography, gel electrophoresis and the like (see, generally, R. Scopes, Protein
Purification, Springer-Verlag, N.Y. (1982), Deutscher, Methods in Enzymology Vol. 182:
Guide to Protein Purification, Academic Press, Inc. N.Y. (1990)). Substantially pure
compositions of at least about 70 to 90% homogeneity are preferred; more preferably at least
91%, 92%, 93%, 94%, 95%, 96%, or 97%; and 98 to 99% or more homogeneity are most
preferred. The purified proteins may also be used, e.g., as immunogens for antibody
production.

[0175] To facilitate purification of the recombinant eukaryotic glycosyltransferase proteins
of the invention, the nucleic acids that encode the recombinant eukaryotic glycosyltransferase
proteins can also include a coding sequence for an epitope or "tag" for which an affinity
binding reagent is available, i.e., a purification tag. Examples of suitable epitopes include the
myc and V-5 reporter genes; expression vectors useful for recombinant production of fusion
proteins having these epitopes are commercially available (e.g., Invitrogen (Carlsbad CA)
vectors pcDNA3.1/Myc-His and pcDNA3.1/V5-His are suitable for expression in
mammalian cells). Additional expression vectors suitable for attaching a tag to the fusion
proteins of the invention, and corresponding detection systems are known to those of skill in
the art, and several are commercially available (e.g., FLAG (Kodak, Rochester NY)).
Another example of a suitable tag is a polyhistidine sequence, which is capable of binding to
metal chelate affinity ligands. Typically, six adjacent histidines are used, although one can
use more or fewer than six. Suitable metal chelate affinity ligands that can serve as the binding
moiety for a polyhistidine tag include nitrilo-tri-acetic acid (NTA) (Hochuli, E. (1990)
"Purification of recombinant proteins with metal chelating adsorbents" In Genetic
Engineering: Principles and Methods, J.K. Setlow, Ed., Plenum Press, NY; commercially
available from Qiagen (Santa Clarita, CA)).

[0176] Purification tags also include maltose binding domains and starch binding domains.
Purification of maltose binding domain proteins is known to those of skill in the art. Starch
binding domains are described in WO 99/15636, herein incorporated by reference. Affinity
purification of a fusion protein comprising a starch binding domain using a beta-cyclodextrin
(BCD)-derivatized resin is described in USSN 60/468,374, filed May 5, 2003, herein
incorporated by reference in its entirety.

[0177] Other haptens that are suitable for use as tags are known to those of skill in the art
and are described, for example, in the Handbook of Fluorescent Probes and Research
Chemicals (6th Ed., Molecular Probes, Inc., Eugene OR). For example, dinitrophenol (DNP),
digoxigenin, barbiturates (see, e.g., US Patent No. 5,414,085), and several types of
fluorophores are useful as haptens, as are derivatives of these compounds. Kits are
commercially available for linking haptens and other moieties to proteins and other
molecules. For example, where the hapten includes a thiol, a heterobifunctional linker such as
SMCC can be used to attach the tag to lysine residues present on the capture reagent.

[0178] One of skill would recognize that modifications can be made to the
glycosyltransferase catalytic or functional domains and/or accessory enzyme catalytic
domains without diminishing their biological activity. Some modifications may be made to
facilitate the cloning, expression, or incorporation of the catalytic domain into a fusion
protein. Such modifications are well known to those of skill in the art and include, for
example, the addition of codons at either terminus of the polynucleotide that encodes the
catalytic domain to provide, for example, a methionine added at the amino terminus to
provide an initiation site, or additional amino acids (e.g., poly His) placed on either terminus
to create conveniently located restriction enzyme sites or termination codons or purification
sequences.

VII. Uses of refolded glycosyltransferases 

[0179] The invention provides recombinant eukaryotic glycosyltransferase proteins and
methods of using the recombinant eukaryotic glycosyltransferase proteins to enzymatically
synthesize glycoproteins, glycolipids, and oligosaccharide moieties, and to glycoPEGylate
glycoproteins. The glycosyltransferase reactions of the invention take place in a reaction
medium comprising at least one glycosyltransferase, acceptor substrate, and donor substrate,
and typically a soluble divalent metal cation. In some embodiments, accessory enzymes and
substrates for the accessory enzyme catalytic moiety are also present, so that the accessory
enzymes can synthesize the donor substrate for the glycosyltransferase. The recombinant
eukaryotic glycosyltransferase proteins and methods of the present invention rely on the use
of the recombinant eukaryotic glycosyltransferase proteins to catalyze the addition of a
saccharide to an acceptor substrate.

[0180] A number of methods of using glycosyltransferases to synthesize glycoproteins and
glycolipids having desired oligosaccharide moieties are known. Exemplary methods are
described, for instance, in WO 96/32491, Ito et al. (1993) Pure Appl. Chem. 65: 753, and US
Patents 5,352,670, 5,374,541, and 5,545,553.

[0181] The recombinant eukaryotic glycosyltransferase proteins prepared as described
herein can be used in combination with additional glycosyltransferases that may or may not
have required refolding for activity. For example, one can use a combination of refolded
recombinant eukaryotic glycosyltransferase protein and a bacterial glycosyltransferase, which
may or may not have been refolded after isolation from a host cell. Similarly, the
recombinant eukaryotic glycosyltransferase can be used with recombinant accessory
enzymes, which may or may not be part of the fusion protein.

[0182] The products produced by the above processes can be used without purification. In
some embodiments, oligosaccharides are produced. Standard, well known techniques, for
example, thin or thick layer chromatography, ion exchange chromatography, or membrane
filtration can be used for recovery of glycosylated saccharides. Also, for example, membrane
filtration, utilizing a nanofiltration or reverse osmotic membrane as described in commonly
assigned AU Patent No. 735695 may be used. As a further example, membrane filtration
wherein the membranes have a molecular weight cutoff of about 1000 to about 10,000 can be
used to remove proteins. As another example, nanofiltration or reverse osmosis can then be
used to remove salts. Nanofilter membranes are a class of reverse osmosis membranes which
pass monovalent salts but retain polyvalent salts and uncharged solutes larger than about 200
to about 1000 Daltons, depending upon the membrane used. Thus, for example, the
oligosaccharides produced by the compositions and methods of the present invention can be
retained in the membrane and contaminating salts will pass through.

VIII. Donor substrates/Acceptor substrates

[0183] Suitable donor substrates used by the recombinant glycosyltransferase fusion
proteins and methods of the invention include, but are not limited to, UDP-Glc, UDP-GlcNAc,
UDP-Gal, UDP-GalNAc, GDP-Man, GDP-Fuc, UDP-GlcUA, and CMP-sialic acid.
Guo et al., Applied Biochem. and Biotech. 68: 1-20 (1997).

[0184] Suitable acceptor substrates used by the recombinant glycosyltransferase fusion
proteins and methods of the invention include, but are not limited to, polysaccharides,
oligosaccharides, proteins, lipids, gangliosides and other biological structures (e.g., whole
cells) that can be modified by the methods of the invention. Exemplary structures that can
be modified by the methods of the invention include any of a number of glycolipids,
glycoproteins and carbohydrate structures on cells known to those skilled in the art, as set
forth in Table 1.

Table 1



Hormones and Growth Factors
• G-CSF
• GM-CSF
• EPO
• EPO variants
• Leptin

Enzymes and Inhibitors
• t-PA
• t-PA variants
• Urokinase
• Factors VII, VIII, IX, X
• DNase
• Glucocerebrosidase
• Hirudin
• α1 antitrypsin
• Antithrombin III

Cytokines and Chimeric Cytokines
• Interleukin-1 (IL-1), 1B, 2, 3, 4
• Interferon-α (IFN-α)
• IFN-α-2b
• IFN-β
• IFN-γ
• Chimeric diphtheria toxin-IL-2

Receptors and Chimeric Receptors
• CD4
• Tumor Necrosis Factor (TNF) receptor
• Alpha-CD20
• MAb-CD20
• MAb-alpha-CD3
• MAb-TNF receptor
• MAb-CD4
• PSGL-1
• MAb-PSGL-1
• Complement
• GlyCAM or its chimera
• N-CAM or its chimera
• LFA-3
• CTLA-IV

Monoclonal Antibodies (Immunoglobulins)
• MAb-anti-RSV
• MAb-anti-IL-2 receptor
• MAb-anti-CEA
• MAb-anti-platelet IIb/IIIa receptor
• MAb-anti-EGF
• MAb-anti-Her-2 receptor

Cells
• Red blood cells
• White blood cells (e.g., T cells, B cells, dendritic cells, macrophages, NK cells,
  neutrophils, monocytes and the like)
• Stem cells

[0185] Examples of suitable acceptor substrates used in fucosyltransferase-catalyzed
reactions, and examples of suitable acceptor substrates used in sialyltransferase-catalyzed
reactions are described in Guo et al., Applied Biochem. and Biotech. 68: 1-20 (1997), but are
not limited thereto.

IX. Glycosyltransferase reactions

[0186] The recombinant eukaryotic glycosyltransferase proteins, acceptor substrates, donor
substrates and other reaction mixture ingredients are combined by admixture in an aqueous
reaction medium. The medium generally has a pH value of about 5 to about 8.5. The
selection of a medium is based on the ability of the medium to maintain pH value at the
desired level. Thus, in some embodiments, the medium is buffered to a pH value of about
7.5. If a buffer is not used, the pH of the medium should be maintained at about 5 to 8.5,
depending upon the particular glycosyltransferase used. For fucosyltransferases, the pH
range is preferably maintained from about 6.0 to 8.0. For sialyltransferases, the range is
preferably from about 5.5 to about 7.5.

[0187] Enzyme amounts or concentrations are expressed in activity units, which are a
measure of the initial rate of catalysis. One activity unit catalyzes the formation of 1 µmol of
product per minute at a given temperature (typically 37°C) and pH value (typically 7.5).
Thus, 10 units of an enzyme is a catalytic amount of that enzyme where 10 µmol of substrate
are converted to 10 µmol of product in one minute at a temperature of 37°C and a pH value
of 7.5.
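
The unit definition above reduces to simple arithmetic, sketched below. The function names and the stock concentration are hypothetical values chosen for illustration; the sketch assumes the measurement is taken within the initial, linear phase of the reaction.

```python
def activity_units(product_umol: float, minutes: float) -> float:
    """Activity in units: micromoles of product formed per minute (initial rate)."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return product_umol / minutes


def enzyme_volume_ml(required_units: float, stock_units_per_ml: float) -> float:
    """Volume of enzyme stock needed to deliver the required number of units."""
    if stock_units_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return required_units / stock_units_per_ml


# The example from the text: 10 umol of product formed in one minute is 10 units.
print(activity_units(10.0, 1.0))
# Hypothetical stock at 4 units/mL: volume needed to supply 10 units.
print(enzyme_volume_ml(10.0, 4.0))
```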

[0188] The reaction mixture may include divalent metal cations (Mg2+, Mn2+). The
reaction medium may also comprise solubilizing detergents (e.g., Triton or SDS) and organic
solvents such as methanol or ethanol, if necessary. The enzymes can be utilized free in
solution or can be bound to a support such as a polymer. The reaction mixture is thus
substantially homogeneous at the beginning, although some precipitate can form during the
reaction.

[0189] The temperature at which an above process is carried out can range from just above 
freezing to the temperature at which the most sensitive enzyme denatures. That temperature 
range is preferably about 0°C to about 45°C, and more preferably about 20°C to about
37°C.

[0190] The reaction mixture so formed is maintained for a period of time sufficient to
obtain the desired high yield of desired oligosaccharide determinants present on
oligosaccharide groups attached to the glycoprotein to be glycosylated. For large-scale
preparations, the reaction will often be allowed to proceed for between about 0.5-240 hours,
and more typically between about 1-18 hours.

[0191] One or more of the glycosyltransferase reactions can be carried out as part of a 
glycosyltransferase cycle. Preferred conditions and descriptions of glycosyltransferase cycles 
have been described. A number of glycosyltransferase cycles (for example, sialyltransferase 
cycles, galactosyltransferase cycles, and fucosyltransferase cycles) are described in U.S. 
Patent No. 5,374,541 and WO 9425615 A. Other glycosyltransferase cycles are described in
Ichikawa et al. J. Am. Chem. Soc. 114:9283 (1992), Wong et al. J. Org. Chem. 57: 4343
(1992), DeLuca et al., J. Am. Chem. Soc. 117:5869-5870 (1995), and Ichikawa et al. In
Carbohydrates and Carbohydrate Polymers. Yaltami, ed. (ATL Press, 1993).

[0192] Other glycosyltransferases can be substituted into similar transferase cycles as have
been described in detail for the fucosyltransferases and sialyltransferases. In particular, the
glycosyltransferase can also be, for instance, glucosyltransferases, e.g., Alg8 (Stagljov et al.,
Proc. Natl. Acad. Sci. USA 91:5977 (1994)) or Alg5 (Heesen et al. Eur. J. Biochem. 224:71
(1994)), N-acetylgalactosaminyltransferases such as, for example, α(1,3)
N-acetylgalactosaminyltransferase, β(1,4) N-acetylgalactosaminyltransferases (Nagata et al. J.
Biol. Chem. 267:12082-12089 (1992) and Smith et al. J. Biol. Chem. 269:15162 (1994)) and
polypeptide N-acetylgalactosaminyltransferase (Homa et al. J. Biol. Chem. 268:12609
(1993)). Suitable N-acetylglucosaminyltransferases include GnTI (2.4.1.101, Hull et al.,
BBRC 176:608 (1991)), GnTII, and GnTIII (Ihara et al. J. Biochem. 113:692 (1993)), GnTV
(Shoreiban et al. J. Biol. Chem. 268:15381 (1993)), O-linked
N-acetylglucosaminyltransferase (Bierhuizen et al. Proc. Natl. Acad. Sci. USA 89:9326 (1992)),
N-acetylglucosamine-1-phosphate transferase (Rajput et al. Biochem J. 285:985 (1992)), and
hyaluronan synthase. Suitable mannosyltransferases include α(1,2) mannosyltransferase,
α(1,3) mannosyltransferase, β(1,4) mannosyltransferase, Dol-P-Man synthase, OCh1, and
Pmt1.

[0193] For the above glycosyltransferase cycles, the concentrations or amounts of the
various reactants used in the processes depend upon numerous factors including reaction 
conditions such as temperature and pH value, and the choice and amount of acceptor
saccharides to be glycosylated. Because the glycosylation process permits regeneration of
activating nucleotides, activated donor sugars and scavenging of produced PPi in the 
presence of catalytic amounts of the enzymes, the process is limited by the concentrations or 
amounts of the stoichiometric substrates discussed before. The upper limit for the 
concentrations of reactants that can be used in accordance with the method of the present
invention is determined by the solubility of such reactants. 

[0194] Preferably, the concentrations of activating nucleotides, phosphate donor, the donor 
sugar and enzymes are selected such that glycosylation proceeds until the acceptor is 
consumed. The considerations discussed below, while in the context of a sialyltransferase, 
are generally applicable to other glycosyltransferase cycles.

[0195] Each of the enzymes is present in a catalytic amount. The catalytic amount of a 
particular enzyme varies according to the concentration of that enzyme's substrate as well as 
to reaction conditions such as temperature, time and pH value. Means for determining the 
catalytic amount for a given enzyme under preselected substrate concentrations and reaction 
conditions are well known to those of skill in the art.

X. Multienzyme oligosaccharide synthesis 

[0196] As discussed above, in some embodiments, two or more enzymes may be used to 
form a desired oligosaccharide determinant on a glycoprotein or glycolipid. For example, a 
particular oligosaccharide determinant might require addition of a galactose, a sialic acid, and 
a fucose in order to exhibit a desired activity. Accordingly, the invention provides methods
in which two or more enzymes, e.g., glycosyltransferases, trans-sialidases, or 
sulfotransferases, are used to obtain high-yield synthesis of a desired oligosaccharide 
determinant. 

[0197] In a particularly preferred embodiment, one of the enzymes used is a 
sulfotransferase which sulfonates the saccharide or the peptide. Even more preferred is the
use of a sulfotransferase to prepare a ligand for a selectin (Kimura et al., Proc Natl Acad Sci
USA 96(8):4530-5 (1999)). 

[0198] In some cases, a glycoprotein- or glycolipid-linked oligosaccharide will include an
acceptor substrate for the particular glycosyltransferase of interest upon in vivo biosynthesis 
of the glycoprotein or glycolipid. Such glycoproteins or glycolipids can be glycosylated
using the recombinant glycosyltransferase fusion proteins and methods of the invention 
without prior modification of the glycosylation pattern of the glycoprotein or glycolipid,
respectively. In other cases, however, a glycoprotein or glycolipid of interest will lack a
suitable acceptor substrate. In such cases, the methods of the invention can be used to alter
the glycosylation pattern of the glycoprotein or glycolipid so that the glycoprotein- or
glycolipid-linked oligosaccharides then include an acceptor substrate for the
glycosyltransferase-catalyzed attachment of a preselected saccharide unit of interest to form a
desired oligosaccharide moiety. 

[0199] Glycoprotein- or glycolipid-linked oligosaccharides optionally can be first
"trimmed," either in whole or in part, to expose either an acceptor substrate for the 
glycosyltransferase or a moiety to which one or more appropriate residues can be added to 
obtain a suitable acceptor substrate. Enzymes such as glycosyltransferases and
endoglycosidases are useful for the attaching and trimming reactions. For example, a
glycoprotein that displays "high mannose"-type oligosaccharides can be subjected to 
trimming by a mannosidase to obtain an acceptor substrate that, upon attachment of one or 
more preselected saccharide units, forms the desired oligosaccharide determinant. 

[0200] The methods are also useful for synthesizing a desired oligosaccharide moiety on a
protein or lipid that is unglycosylated in its native form. A suitable acceptor substrate for the 
corresponding glycosyltransferase can be attached to such proteins or lipids prior to 
glycosylation using the methods of the present invention. See, e.g., US Patent No. 5,272,066 
for methods of obtaining polypeptides having suitable acceptors for glycosylation. 

[0201] Thus, in some embodiments, the invention provides methods for in vitro sialylation
of saccharide groups present on a glycoconjugate that first involves modifying the 
glycoconjugate to create a suitable acceptor. 

XI. Conjugation of modified sugars to peptides 

[0202] The modified sugars are conjugated to a glycosylated or non-glycosylated peptide or 
protein using an appropriate enzyme to mediate the conjugation. Preferably, the
concentrations of the modified donor sugar(s), enzyme(s) and acceptor peptide(s) or
protein(s) are selected such that glycosylation proceeds until the acceptor is consumed. The 
considerations discussed below, while set forth in the context of a sialyltransferase, are 
generally applicable to other glycosyltransferase reactions. 

[0203] A number of methods of using glycosyltransferases to synthesize desired
oligosaccharide structures are known and are generally applicable to the instant invention.
Exemplary methods are described in, for instance, WO 96/32491, Ito et al., Pure Appl. Chem.
65: 753 (1993), and U.S. Pat. Nos. 5,352,670, 5,374,541, and 5,545,553.

[0204] In some embodiments, an endoglycosidase is used in the reaction in combination
with glycosyltransferases. The enzymes are used to alter a saccharide structure on the
peptide at any point either before or after the addition of the modified sugar to the peptide.

[0205] In another embodiment, the method makes use of one or more exo- or
endoglycosidases. The glycosidase is typically a mutant, which is engineered to form glycosyl
bonds rather than rupture them. The mutant glycanase typically includes a substitution of an
amino acid residue for an active site acidic amino acid residue. For example, when the
endoglycanase is endo-H, the substituted active site residues will typically be Asp at position
130, Glu at position 132 or a combination thereof. The amino acids are generally replaced 
with serine, alanine, asparagine, or glutamine. 

[0206] The mutant enzyme catalyzes the reaction, usually by a synthesis step that is
analogous to the reverse reaction of the endoglycanase hydrolysis step. In these
embodiments, the glycosyl donor molecule (e.g., a desired oligo- or mono-saccharide
structure) contains a leaving group and the reaction proceeds with the addition of the donor
molecule to a GlcNAc residue on the protein. For example, the leaving group can be a
halogen, such as fluoride. In other embodiments, the leaving group is an Asn or an
Asn-peptide moiety. In yet further embodiments, the GlcNAc residue on the glycosyl donor
molecule is modified. For example, the GlcNAc residue may comprise a 1,2-oxazoline
moiety.

[0207] In a preferred embodiment, each of the enzymes utilized to produce a conjugate of
the invention is present in a catalytic amount. The catalytic amount of a particular enzyme
varies according to the concentration of that enzyme's substrate as well as to reaction
conditions such as temperature, time and pH value. Means for determining the catalytic
amount for a given enzyme under preselected substrate concentrations and reaction 
conditions are well known to those of skill in the art. 

[0208] The temperature at which an above process is carried out can range from just above 
freezing to the temperature at which the most sensitive enzyme denatures. Preferred 
temperature ranges are about 0°C to about 55°C, and more preferably about 20°C to about
30°C. In another exemplary embodiment, one or more components of the present method
are conducted at an elevated temperature using a thermophilic enzyme.

[0209] The reaction mixture is maintained for a period of time sufficient for the acceptor to
be glycosylated, thereby forming the desired conjugate. Some of the conjugate can often be 
detected after a few hours, with recoverable amounts usually being obtained within 24 hours 
or less. Those of skill in the art understand that the rate of reaction is dependent on a number 
of variable factors (e.g., enzyme concentration, donor concentration, acceptor concentration,
temperature, solvent volume), which are optimized for a selected system. 

[0210] The present invention also provides for the industrial-scale production of modified 
peptides. As used herein, an industrial scale generally produces at least one gram of finished, 
purified conjugate. 

[0211] In the discussion that follows, the invention is exemplified by the conjugation of
modified sialic acid moieties to a glycosylated peptide. The exemplary modified sialic acid is
labeled with PEG. The focus of the following discussion on the use of PEG-modified sialic
acid and glycosylated peptides is for clarity of illustration and is not intended to imply that
the invention is limited to the conjugation of these two partners. One of skill understands that
the discussion is generally applicable to the additions of modified glycosyl moieties other
than sialic acid. Moreover, the discussion is equally applicable to the modification of a 
glycosyl unit with agents other than PEG including other water-soluble polymers, therapeutic 
moieties, and biomolecules. 

[0212] An enzymatic approach can be used for the selective introduction of PEGylated or 
PPGylated carbohydrates onto a peptide or glycopeptide. The method utilizes modified
sugars containing PEG, PPG, or a masked reactive functional group, and is combined with
the appropriate glycosyltransferase or glycosynthase. By selecting the glycosyltransferase 
that will make the desired carbohydrate linkage and utilizing the modified sugar as the donor 
substrate, the PEG or PPG can be introduced directly onto the peptide backbone, onto 
existing sugar residues of a glycopeptide or onto sugar residues that have been added to a
peptide. 

[0213] An acceptor for the sialyltransferase is present on the peptide to be modified by the 
methods of the present invention either as a naturally occurring structure or one placed there 
recombinantly, enzymatically or chemically. Suitable acceptors include, for example,
galactosyl acceptors such as Galβ1,4GlcNAc, Galβ1,4GalNAc, Galβ1,3GalNAc,
lacto-N-tetraose, Galβ1,3GlcNAc, Galβ1,3Ara, Galβ1,6GlcNAc, Galβ1,4Glc (lactose), and other
acceptors known to those of skill in the art (see, e.g., Paulson et al., J. Biol. Chem. 253:
5617-5624 (1978)).

[0214] In one embodiment, an acceptor for the sialyltransferase is present on the 
glycopeptide to be modified upon in vivo synthesis of the glycopeptide. Such glycopeptides 
can be sialylated using the claimed methods without prior modification of the glycosylation
pattern of the glycopeptide. Alternatively, the methods of the invention can be used to 
sialylate a peptide that does not include a suitable acceptor; one first modifies the peptide to 
include an acceptor by methods known to those of skill in the art. In an exemplary 
embodiment, a GalNAc residue is added by the action of a GalNAc transferase. 

[0215] In an exemplary embodiment, the galactosyl acceptor is assembled by attaching a
galactose residue to an appropriate acceptor linked to the peptide, e.g., a GlcNAc. The
method includes incubating the peptide to be modified with a reaction mixture that contains a
suitable amount of a galactosyltransferase (e.g., galβ1,3 or galβ1,4), and a suitable galactosyl
donor (e.g., UDP-galactose). The reaction is allowed to proceed substantially to completion
or, alternatively, the reaction is terminated when a preselected amount of the galactose
residue is added. Other methods of assembling a selected saccharide acceptor will be 
apparent to those of skill in the art. 

[0216] In yet another embodiment, glycopeptide-linked oligosaccharides are first 
"trimmed," either in whole or in part, to expose either an acceptor for the sialyltransferase or 
a moiety to which one or more appropriate residues can be added to obtain a suitable
acceptor. Enzymes such as glycosyltransferases and endoglycosidases (see, for example, U.S.
Patent No. 5,716,812) are useful for the attaching and trimming reactions. 

[0217] Methods for conjugation of modified sugars to peptides or proteins are found, e.g., in
USSN 60/328,523 filed October 10, 2001; USSN 60/387,292, filed June 7, 2002; USSN
60/391,777 filed June 25, 2002; USSN 60/404,249 filed August 16, 2002; and
PCT/US02/32263; each of which is herein incorporated by reference for all purposes.

EXAMPLES 

Example 1: Refolding Rat Liver ST3GalIII Expressed in Bacteria.
Refolding rat liver GST-ST3GalIII fusion protein
[0218] Rat liver N-acetyllactosaminide α2,3-sialyltransferase (ST3GalIII) was cloned into
pGEX-KT-Ext vector and expressed as GST-ST3GalIII inclusion bodies in E. coli BL21
cells. Inclusion bodies were refolded using a GSH/GSSG redox system. The refolded
enzyme, GST-ST3GalIII, was active and transferred sialic acid to an LNnT sugar substrate
and to asialylated glycoproteins, for example, transferrin and Factor IX.

Cloning ST3GalIII into pGEX-KT-Ext vector

[0219] The rat liver ST3GalIII gene was cloned into BamHI and EcoRI sites of the
pGEX-KT-Ext vector after PCR amplification using the following primers:

Sense: Sial 5'Tm 5'-TTTGGATCCAAGCTACACTTACTCCAATGG

Antisense: Sial 3' Whole 5'-TTTGAATTCTCAGATACCACTGCTTAAGTC

Expression of GST-ST3GalIII in E. coli BL21 cells

[0220] pGEX-ST3GalIII, an expression vector comprising the ST3GalIII GST fusion, was
transformed into chemically competent E. coli BL21 cells. Single colonies were picked,
inoculated into five ml LB media with 100 µg/ml carbenicillin, and grown overnight at 37°C
with shaking. The next day, one ml of overnight culture was transferred into one liter of LB
media with 100 µg/ml carbenicillin. Bacteria were grown to an OD620 of 0.7, then 150
µM IPTG (final) was added to the medium. Bacteria were grown at 37°C for one to two
hours more, then shifted to room temperature and grown overnight with shaking. Cells were
harvested by centrifugation; bacterial pellets were resuspended in PBS buffer and lysed using
a French Press. Soluble and insoluble fractions were separated by centrifugation for thirty
minutes at 10,000 RPM in a Sorvall SS 34 rotor at 4°C.

Purification of the inclusion bodies

[0221] Fifty ml of Novagen's Wash buffer (20 mM Tris-HCl, pH 7.5, 10 mM EDTA, 1%
Triton X-100) was added to the insoluble fraction, i.e., the inclusion bodies (IB's). The
insoluble fraction was vortexed to resuspend the pellet. The suspended IB's were centrifuged
and washed at least twice by resuspending in Wash Buffer as above. Clean precipitates
(IB's) were recovered and were stored at -20°C until use.

Refolding inclusion bodies 

[0222] The IB's were weighed (144 mg) and dissolved in Genotech IBS buffer (1.44 ml).
The resuspended IB's were incubated at 4°C for one hour in an Eppendorf centrifuge tube.
Insoluble material was removed by centrifugation at maximum speed in an Eppendorf
centrifuge. Solubilized IB's were diluted to 4 ml final volume. Refolding of GST-ST3GalIII
was tested in refolding buffer solutions containing cyclodextrin, polyethylene glycol (PEG),
NDSB-201, or a GSH/GSSG redox system. One ml of solubilized IB's was diluted rapidly
by pipetting into the refolding solution, vigorously mixed for 30-40 seconds, and then gently
stirred for two hours at 4°C. Three ml aliquots of the refolded GST-ST3GalIII solutions
were dialyzed against cold PBS buffer or a buffer containing 50 mM Tris-HCl, pH 7.0; 100
mM NaCl; and 1% glycerol using Pierce Slide-A-Lyzers (MWCO: 3.5 kDa). After dialysis,
the GST-ST3GalIII solutions were concentrated 3, 6 and 12 fold using Vivaspin 5 K
(VivaScience) concentrators in a Jouan centrifuge at 4,000 rpm at 4°C.

[0223] After refolding and dialysis, the refolded GST-ST3GalIII proteins were analyzed by
SDS-polyacrylamide gel electrophoresis. The GST-ST3GalIII fusion, with a molecular 
weight of about 63-64 kDa, was present under all refolding conditions. (Data not shown.) 

Sialylation of oligosaccharides using refolded GST-ST3 Gal III 
[0224] Enzymatic assays using oligosaccharide substrates were carried out using CE-LIF
(Capillary Electrophoresis-Laser Induced Fluorescence). Refolded ST3GalIII enzymes
were assayed for ability to transfer sialic acid from CMP-NAN (cytidine
5'-monophosphate-β-D-sialic acid) to LNnT-APTS (Lacto-N-Neotetraose-9-aminopyrene-1,4,6-trisulfonic
acid) to form LSTd-APTS (Lactosialic-Tetrasaccharide-d-APTS). Reactions were
performed in 96 well microtiter plates in 100 µl of a buffer containing 20 mM MOPS, pH 6.5;
0.8 mM CMP-NAN; 22.1 mM LNnT; 25 µM LNnT-APTS; 2.5 mM MnCl2. Reactions were
started by addition of 20 µl of refolded ST3GalIII and incubated at 30°C for thirty minutes.
Reactions were quenched with a 1 to 25 dilution with water. The diluted reaction was analyzed
by CE-LIF using an N-CHO coated capillary according to the manufacturer's guide. Activities were
calculated as the ratio of the normalized peak areas of LNnT-APTS to LSTd-APTS. Results
comparing different refolding conditions are shown in Table 2. Two additional experiments
using the GSH/GSSG system are shown in Table 3.

Table 2. GST-ST3GalIII activities after screening different folding systems. The proteins
were assayed directly without concentration.

Cyclodextrin    PEG    NDSB-201    GSH/GSSG
0               0      0           7.8 U/L*

*Activities reported here are Units per L refolded enzyme.

Table 3. GST-ST3GalIII activities after two separate folding experiments using the GSH/GSSG
system.

GSH/GSSG             Conc.    Activity
Refolding Trial 1    12x      182 U/L*
Refolding Trial 2    40x      531 U/L*

* Activities reported here are Units per L refolded enzyme.

Sialylation of glycoproteins using refolded GST-ST3 Gal III 

[0225] Twenty µL of asialylated Transferrin (2 µg/µL) or asialylated Factor IX (2 µg/µL)
was added to fifty µL of a buffer containing 50 mM Tris, pH 8.0; and 150 mM NaCl, with 10
µL of 100 mM MnCl2; 10 µL of 200 mM CMP-NAN; and 0.05% sodium azide. The reaction
mixture was incubated with 30 µL refolded GST-ST3GalIII at 30°C overnight or longer with
shaking at 250 rpm. After the reactions were stopped, the sialylated proteins were separated
on pH 7-3 IEF (isoelectric focusing) gels (Invitrogen) and stained with Coomassie Blue
according to the manufacturer's guidelines. Both Transferrin and Factor IX were sialylated by
GST-ST3GalIII. (Data not shown.)
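
The final concentrations in the sialylation reaction described above follow from simple dilution arithmetic. The Python sketch below is illustrative only. It assumes that the listed volumes are strictly additive (20 + 50 + 10 + 10 + 30 µL), which is one reading of the paragraph, and the function and component names are hypothetical.

```python
def final_concentrations(components):
    """Compute the total volume and final concentrations of a mixture.

    components: list of (name, stock_concentration_or_None, volume_ul).
    Components with no stated stock concentration only add volume.
    """
    total_ul = sum(volume for _, _, volume in components)
    finals = {
        name: stock * volume / total_ul
        for name, stock, volume in components
        if stock is not None
    }
    return total_ul, finals


# Volumes and stock concentrations from the paragraph above;
# additivity of the volumes is an assumption.
MIX = [
    ("glycoprotein_ug_per_ul", 2.0, 20.0),
    ("tris_nacl_buffer", None, 50.0),
    ("MnCl2_mM", 100.0, 10.0),
    ("CMP_NAN_mM", 200.0, 10.0),
    ("refolded_enzyme", None, 30.0),
]

total_ul, finals = final_concentrations(MIX)
```

Under the additivity assumption the total volume is 120 µL, and each final concentration is the stock concentration scaled by that component's share of the total volume.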

Refolding a rat liver ST3GalIII fused to an MBP tag. 
[0226] Rat liver ST3GalIII was cloned into pMAL-c2x vector and expressed as a maltose
binding protein (MBP) fusion, MBP-ST3GalIII, in inclusion bodies of E. coli TB1 cells. The
refolded MBP-ST3GalIII was active and transferred sialic acid to LNnT, a sugar substrate,
and to asialylated glycoproteins, for example asialo-transferrin.

Cloning ST3GalIII into pMAL-c2x vector 

[0227] The rat liver ST3GalIII nucleic acid was cloned into BamHI and XbaI sites of the
pMAL-c2x vector after PCR amplification using the following primers:

Sense: ST3BAMH1 5'-TAATGGATTCAAGCTACACTTACTCCAATGG
Antisense: ST3XBA1 5'-GCGCTCTAGATCAGATACCACTGCTTAAGT

[0228] Nucleotides encoding amino acids 28-374, e.g., the stem region and catalytic
domain of ST3GalIII, were fused to the MBP amino acid tag.

[0229] Three other truncations of ST3GalIII were constructed and fused to MBP. The
three ST3GalIII (Δ73, Δ85, Δ86) inserts were isolated by PCR using the following 5'
primers: (ST3 BamH1 Δ73) TGTATCGGATCCCTGGCCACCAAGTACGCTAACTT; (ST3
BamH1 Δ85) TGTATCGGATCCTGCAAACCCGGCTACGCTTCAGCCAT; and (ST3
BamH1 Δ86) TGTATCGGATCCAAACCCGGCTACGCTTCAGCCAT, respectively, in
pairs with the common 3' primer (ST3-Xho1-
GGTCTCCTCGAGTCAGATACCACTGCTTAA). Each PCR product was digested with
BamHI and XhoI, subcloned into BamHI-XhoI digested pCWin2-MBP Kanr vector,
transformed into TB1 cells, and screened for the correct construct.
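
As a quick consistency check on the truncation primers listed above, the restriction sites used for subcloning can be located in each sequence. The Python sketch below is illustrative only; the recognition sequences GGATCC (BamHI) and CTCGAG (XhoI) are the standard ones, while the dictionary keys and function name are assumptions made for this example.

```python
# Standard recognition sequences for the two enzymes used for subcloning.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}

# 5' primers for the three truncations, as listed in the paragraph above.
FORWARD_PRIMERS = {
    "delta73": "TGTATCGGATCCCTGGCCACCAAGTACGCTAACTT",
    "delta85": "TGTATCGGATCCTGCAAACCCGGCTACGCTTCAGCCAT",
    "delta86": "TGTATCGGATCCAAACCCGGCTACGCTTCAGCCAT",
}

# Common 3' primer, as listed in the paragraph above.
REVERSE_PRIMER = "GGTCTCCTCGAGTCAGATACCACTGCTTAA"


def site_position(primer: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the primer, or -1."""
    return primer.upper().find(SITES[enzyme])


bamhi_positions = {
    name: site_position(seq, "BamHI") for name, seq in FORWARD_PRIMERS.items()
}
xhoi_position = site_position(REVERSE_PRIMER, "XhoI")
```

Each 5' primer carries the BamHI site directly after a six-nucleotide leader, and the common 3' primer carries the XhoI site at the same offset, which is consistent with the BamHI-XhoI subcloning step described.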

[0230] PCR reactions were carried out under the following conditions. One cycle at 95°C 
for 1 minute. One µl Vent polymerase was added. Ten of the following cycles were
performed: 94°C for 1 minute; 65°C for 1 minute; and 72°C for 1 minute. After a final ten 
minutes at 72°C, the reaction was cooled to 4°C. 
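
The nominal run time of the PCR program above can be tallied from its steps. The following Python sketch is illustrative only; it counts hold times and ignores temperature ramping and the pause for polymerase addition, and the function name is an assumption made for this example.

```python
def program_minutes(initial_min, cycles, step_minutes, final_min):
    """Sum the hold times of a PCR program (ramping time is ignored)."""
    return initial_min + cycles * sum(step_minutes) + final_min


# 95°C for 1 min; ten cycles of 94°C, 65°C and 72°C for 1 min each;
# final extension of 10 min at 72°C.
total_minutes = program_minutes(1, 10, [1, 1, 1], 10)
```

With the values given in the paragraph, the hold times add up to 41 minutes.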

[0231] All of the ST3GalIII truncations had activity after refolding. The experiments
described below were performed using the MBP Δ73 ST3GalIII truncation.

Expression of MBP-ST3GalIII in E. coli TB1 cells

[0232] The pMAL-ST3GalIII plasmid was transformed into chemically competent E. coli
TB1 cells. Three isolated colonies containing the TB1/pMAL-ST3GalIII construct were picked
from the LB agar plates. The colonies were grown in five ml of LB media supplemented
with 60 µg/ml carbenicillin at 37°C with shaking until the liquid cultures reached an OD620 of
0.7. Two one ml aliquots were withdrawn from each culture and used to inoculate fresh
media with or without 500 µM IPTG (final). The cultures were grown at 37°C for two hours.
Bacterial cells were harvested by centrifugation. Total cell lysates were prepared by heating the
cell pellets in the presence of SDS and DTT. IPTG induced expression of MBP-ST3GalIII.
(Data not shown.) 

Expression of MBP-ST3GalIII and Purification of the inclusion bodies: 
[0233] A one ml aliquot of TB1/pMAL-ST3GalIII overnight culture was inoculated into
0.5 liter of LB media with 50 µg/ml carbenicillin and grown to an OD620 of 0.7. Expression
of MBP-ST3GalIII was induced by addition of 0.5 mM IPTG, followed by overnight
incubation at room temperature. The next day bacterial cells were harvested by
centrifugation. Cell pellets were resuspended in a buffer containing 75 mM Tris-HCl, pH 7.4;
100 mM NaCl; and 1% glycerol. Bacterial cells were lysed using a French Press. Soluble
and insoluble fractions were separated by centrifugation for thirty minutes at 10,000 RPM in
a Sorvall SS 34 rotor at 4°C.

Purification of the inclusion bodies and refolding of MBP-ST3GalIII using GSH/GSSG
[0234] The MBP-ST3GalIII inclusion bodies were purified and suspended using the same
methods and buffers used for the GST-ST3GalIII fusion proteins described above. The
MBP-ST3GalIII proteins were refolded using the GSH/GSSG system described above. The refolded
MBP-ST3GalIII enzymes were dialyzed against cold 65 mM Tris-HCl pH 7.5, 100 mM
NaCl, 1% glycerol using a Pierce SnakeSkin dialysis bag (MWCO: 7 kDa). The refolded and
dialyzed MBP-ST3GalIII proteins were concentrated 3-14 fold using Vivaspin 5 K
(VivaScience) concentrators in a Jouan centrifuge at 4,000 rpm at 4°C. The refolded
MBP-ST3GalIII proteins were analyzed by SDS-polyacrylamide gel electrophoresis. An 81 kDa
MBP-ST3GalIII was detected. (Data not shown.)

MBP-ST3 Gal III enzymatic activity assays 

[0235] Refolded MBP-ST3GalIII enzymes were assayed for ability to transfer sialic acid
from CMP-NAN to LNnT-APTS to form LSTd-APTS, as described above. The refolded
MBP-ST3GalIII enzymes were active and transferred sialic acid to LNnT-APTS to form
LSTd-APTS. (Data not shown.)

[0236] Refolded MBP-ST3GalIII enzymes were assayed for ability to transfer sialic acid
from CMP-NAN to glycoproteins. Transfer of sialic acid to asialo-Transferrin was assayed
as described above for GST-ST3GalIII enzymes. The refolded MBP-ST3GalIII enzymes
were active and transferred sialic acid to asialo-Transferrin. (Data not shown.)

Additional assays of conditions for refolding MBP-ST3GalIII

[0237] MBP-ST3GalIII was refolded using the conditions shown in Figure 1. The buffer,
redox couple and detergent (if used) were mixed before addition of solubilized IB's to start
the refolding reaction. IB's were diluted 1/20. MBP-ST3GalIII refolding was also successful
using different redox couples, for example Cystamine·2HCl/Cysteine at molar ratios of
1/4, 4/1, 1/10, or 5/5. (Data not shown.)

ST3 Gal III enzymatic activity assays 

[0238] Refolded MBP-ST3GalIII enzymes were assayed for ability to transfer sialic acid
from CMP-NAN to LNnT-APTS to form LSTd-APTS, as described above. Results are
shown in Figure 1. The highest refolded MBP-ST3GalIII activities were seen using
conditions 8, 11, 13 and 16. When refolding was scaled up to five ml, MBP-ST3GalIII
proteins refolded using conditions 8 and 16 had the highest activity. (See, e.g., Table 4.)

Table 4.

Condition    U/L folded protein    U/g IB's
8            70                    37.0
6            50                    40.5

Purification of MBP-ST3GalIII on amylose column

[0239] Refolded MBP-ST3GalIII proteins from the 5 ml refolding preparation were
combined and dialyzed against 100 mM Tris-HCl pH 7.4, 100 mM NaCl and 1% glycerol.
The refolded MBP-ST3GalIII proteins were applied to an amylose column. Most of the
refolded MBP-ST3GalIII protein was bound to the amylose column and eluted with 10 mM
maltose. An elution profile is shown in Figure 2. Enzymatic activity of the MBP-ST3GalIII
fractions was determined using the LNnT assay and is shown in Figure 3.

GlycoPEGylation of asialotransferrin with refolded MBP-ST3GalIII:
[0240] Asialo-transferrin (2 mg/ml) was incubated with 100 µl of purified fractions of
refolded MBP-ST3GalIII in the presence of CMP-SA-PEG (10 kDa, 1.6 mM) or CMP-SA-PEG
(20 kDa, 1.06 mM) in a 230 µl reaction. GlycoPEGylation reactions were carried out at 30°C
overnight or for three days. Aliquots were withdrawn from the reactions and analyzed on a
4-20% SDS-polyacrylamide gel. Results are shown in Figure 4. Purified, refolded
MBP-ST3GalIII transfers 10 or 20 K PEGylated sialic acids to asialo-transferrin.

Large scale MBP-ST3GalIII refolding
[0241] The following method was used to make large scale refolded MBP-ST3GalIII.

[0242] Wet IB's (470 mg) were dissolved in IB solubilization buffer (13 ml) in a 15 ml culture
tube. IB solubilization buffer includes the following: 4 M Guanidine HCl; 100 mM Tris-HCl,
pH 9; and 100 mM NaCl. IB's were incubated in IB solubilization buffer at 4°C for about 1
hour with gentle shaking. Any insoluble material was removed by centrifugation in 1.5 mL
Eppendorf tubes, at 4°C at max speed, for 30 minutes. The solubilized IB's were transferred
to clean tubes and protein concentration was determined using absorbance at 280 nm. 

[0243] The following refolding solution was prepared and kept at 4°C: 55 mM MES
buffer, pH 6.5; 264 mM NaCl; 11 mM KCl; 0.055% PEG 550; 550 mM Arginine. The
buffer was supplemented with 0.3 mM lauryl maltoside (LM); 0.1 mM oxidized glutathione
(GSSG); 1 mM reduced glutathione (GSH) immediately before the addition of solubilized
IB's. Two ml of solubilized IB's were added into 43 ml of refolding buffer in a 50 ml sterile
culture tube. The tube was placed on a rocker-shaker and gently shaken for 24 hours at 4°C.
The refolded protein was dialyzed in dialysis tubing (MWCO: 7 kD) against Dialysis Buffer
(100 mM Tris-HCl, pH 7.5; 100 mM NaCl; and 5% glycerol) twice (in 10-20 volume excess
buffer).
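
The dilution that starts the large scale refolding reaction can be worked through numerically. The Python sketch below is illustrative only. It assumes the two volumes are additive (2 ml + 43 ml = 45 ml) and that the stated refolding solution concentrations refer to the solution before the solubilized IB's are added; the function and variable names are hypothetical.

```python
def diluted(concentration, part_ml, total_ml):
    """Concentration after a component of part_ml is brought to total_ml."""
    return concentration * part_ml / total_ml


IB_ML = 2.0        # solubilized IB's added
BUFFER_ML = 43.0   # refolding buffer
TOTAL_ML = IB_ML + BUFFER_ML

# Fold dilution of the solubilized IB's.
dilution_factor = TOTAL_ML / IB_ML

# Guanidine HCl carried over from the 4 M solubilization buffer.
guanidine_M = diluted(4.0, IB_ML, TOTAL_ML)

# Refolding solution components as listed in the paragraph above.
refolding_solution = {
    "MES_mM": 55.0,
    "NaCl_mM": 264.0,
    "KCl_mM": 11.0,
    "arginine_mM": 550.0,
}
final_refolding = {
    name: diluted(conc, BUFFER_ML, TOTAL_ML)
    for name, conc in refolding_solution.items()
}
```

Under these assumptions the solubilized IB's are diluted 22.5-fold, and the residual guanidine concentration falls to roughly 0.18 M.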

[0244] The large scale dialyzed, refolded MBP-ST3GalIII was analyzed for ST3GalIII activity,
and exhibited about 53.6 U/g IB.

Example 2: Site Directed Mutagenesis of Human GnTI to Enhance Refolding. 

[0245] A truncated human N-acetylglucosaminyltransferase I was expressed in E. coli as a
maltose binding fusion protein (GnTI/MBP). The fusion protein was insoluble and was
expressed in inclusion bodies. After solubilization and refolding, the GnTI/MBP fusion
protein had low activity. The crystal structure of a truncated form of rabbit GnTI (105 amino
terminal amino acids deleted) shows an unpaired cysteine residue (CYS123) near the active
site. (See, e.g., Unligil et al., EMBO J. 19:5269-5280 (2000)). The corresponding unpaired
cysteine in the human GnTI was identified as CYS121 and was replaced with a series of
amino acids that are similar in size and chemical characteristics. The amino acids used
include serine (Ser), threonine (Thr), alanine (Ala) and aspartic acid (Asp). In addition, a
double mutant, ARG120ALA, CYS121HIS, was also made. The mutant GnTI/MBP fusion
proteins were expressed in E. coli, refolded and assayed for GnTI activity towards
glycoproteins.

[0246] Mutagenesis was done using a QuikChange Site-Directed Mutagenesis Kit from
Stratagene. Additional restriction sites were introduced with some of the GnTI mutations.
For example, an ApaI site (underlined, GGGCCCAC) was introduced into the GnTI
ARG120ALA, CYS121HIS mutant, i.e., CGC CTG -> GCC CAC (changes in bold). The
following mutagenic oligonucleotides were used to make the double mutant: GnTI R120A,
C121H+, 5'CCGCAGCACTGTTCGGGCCCACCTGGACAAGCTGCTG 3'; and GnTI
R120A, C121H- 5'CAGCAGCTTGTCCAGGTGGGCCCGAACAGTGCTGCGG 3'
(changes shown in bold). An AscI site (underlined, GGCGCGCC) was introduced into the
GnTI CYS121ALA mutant, i.e., CTG -> GCC (changes in bold). The following mutagenic
oligonucleotides were used to make the GnTI CYS121ALA mutant: GnTI C123A+
5'AGCACTGTTCGGCGCGCCCTGGACAAGCTGCTG 3'; and GnTI C123A-
5'CAGCAGCTTGTCCAGGGCGCGCCGAACAGTGCT 3'.

[0247] The activity of the mutant proteins expressed in E. coli was compared to the activity
of wild type GnTI expressed in baculovirus. A CYS121SER GnTI mutant was active in a
TLC based assay. In contrast, a CYS121THR mutant had no detectable activity and a
CYS121ASP mutant had low activity. A CYS121ALA mutant was very active, and a double
mutant, ARG120ALA, CYS121HIS, based on the amino acid sequence of the C. elegans
GnTI protein (Gly14), also exhibited activity, including transfer of GlcNAc to glycoproteins.
Amino acid and encoding nucleic acid sequences of the GnTI mutants are provided in
Figures 7-11.
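
The mutagenic oligonucleotide pairs listed in paragraph [0246] can be sanity-checked computationally. The following is a minimal sketch, not part of the original method: it assumes the sequences are written 5' to 3' exactly as printed, and the dictionary keys and function names are illustrative only. It confirms that each "+" and "-" oligonucleotide are reverse complements of one another and that the "+" strand carries the ApaI (GGGCCC) or AscI (GGCGCGCC) recognition sequence.

```python
# Sketch: verify mutagenic primer pairs and the restriction sites they introduce.
# Sequences are copied from the text; names are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# (plus strand, minus strand), both written 5'->3'.
PRIMERS = {
    "R120A_C121H": (
        "CCGCAGCACTGTTCGGGCCCACCTGGACAAGCTGCTG",
        "CAGCAGCTTGTCCAGGTGGGCCCGAACAGTGCTGCGG",
    ),
    "C121A": (
        "AGCACTGTTCGGCGCGCCCTGGACAAGCTGCTG",
        "CAGCAGCTTGTCCAGGGCGCGCCGAACAGTGCT",
    ),
}

# Recognition sequences: ApaI for the double mutant, AscI for the Ala mutant.
SITES = {"R120A_C121H": "GGGCCC", "C121A": "GGCGCGCC"}


def check_pair(name: str) -> bool:
    """True if the pair is mutually reverse-complementary and contains its site."""
    plus, minus = PRIMERS[name]
    return reverse_complement(plus) == minus and SITES[name] in plus


if __name__ == "__main__":
    for primer_name in PRIMERS:
        print(primer_name, check_pair(primer_name))
```

Because both recognition sequences are palindromic, finding the site on the "+" strand also places it on the "-" strand.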

Example 3: MBP fusions to GalT1.

[0248] The following fusions between truncated bovine GalT1 and MBP were constructed:
MBP-GalT1 (Δ129) wt, (Δ70) wt or (Δ129 C342T). (For the full length bovine sequence,
see, e.g., D'Agostaro et al., Eur. J. Biochem. 183:211-217 (1989) and accession number
CAA32695.) Each construct had activity after refolding. The MBP-GalT1(Δ129 C342T)
was more efficient in downstream processing and was used in the experiments described
below.

Example 4: One Pot Method of Refolding Multiple Glycosyltransferases.
[0249] Eukaryotic ST3GalIII, GalT1, and GnT1 enzymes build N-glycan chains on
glycoproteins. Additional modifications, for example GlycoPEGylation, can be performed
using CMP-NAN-PEG as a donor substrate. Eukaryotic ST3GalIII, GalT1, and GnT1
enzymes are typically expressed in eukaryotic expression systems, for example fungal or
mammalian cells.

[0250] Eukaryotic ST3GalIII, GalT1, and GnT1 enzymes, each fused to a maltose binding
protein (MBP) domain, were solubilized, combined, and refolded together in a single vessel.
The MBP fused and refolded enzymes were active and were used to add N-glycans to
glycoproteins or to glycoPEGylate glycoproteins. The refolding buffer included a redox
couple, for example, reduced/oxidized glutathione (GSH/GSSG). Refolding was enhanced
by addition of arginine and polyethylene glycol 3350 (PEG). The IB's can be solubilized
individually and added to refolding buffer in different proportions, or solubilized together
and added to the refolding buffer directly. One step purification or
immobilization of these enzymes can also be done using the MBP fusion tag.

Preparation of a refolded glycosyltransferase mixture (SuperGlycoMix)
Preparation of the glycosyltransferases IB's

[0251] Bacterial strains used to produce eukaryotic ST3GalIII, GalT1, and GnT1 enzymes
are shown in Table 5. The table also shows the estimated molecular weight of the MBP
fusion proteins. (MW based on amino acid composition, Vector NTI software.) All nucleic
acids encoding the eukaryotic enzymes were expressed from IPTG inducible expression
vectors.

Table 5

Strain/Construct                        Protein expressed (IB's)    MW (kD)
JM109/pCWori-MBP-GalT1 (Δ129) C342T     MBP-GalT1(Δ129) C342T       74.2
JM109/pCWIN2-MBP-GnT1 (Δ103) C121A      MBP-GnT1(Δ103) C121A        82.4
TB1/pMAL-ST3GalIII                      MBP-ST3GalIII               82

[0252] Following IPTG induction of E. coli cultures, IB's containing GnT1, GalT1 and
ST3GalIII enzymes were isolated by lysing the cells using a French Press or detergent lysis
(Novagen's Bugbuster Reagent). Pellets were recovered after centrifugation and processed to
obtain IB's, as described previously. IB's were washed at least two times using Novagen's
IB wash buffer. Washed IB's were stored at -20°C until used in refolding
experiments.

[0253] IB's containing ST3GalIII, GalT1, or GnT1 were separately dissolved in a buffer
containing 6 M Guanidine HCl, 50 mM TrisHCl pH 8.0, 5 mM EDTA, 10 mM DTT at 4°C
for one hour. Cleared supernatants were obtained after centrifugation (max speed in an
Eppendorf micro-centrifuge). The protein content of the solubilized IB's was determined by
measuring absorbance at 280 nm. The protein contents in Table 6 were determined based on
the extinction coefficients of each MBP-Glycosyltransferase. The extinction coefficients
were calculated using Vector NTI software (See Table 5).

Table 6. Protein concentrations in solubilized IB's.

Protein                    A280 at 1 mg/ml    mg/ml
MBP-ST3GalIII              1.49               4.23
MBP-GalT1(Δ129) C342T      1.39               6.80
MBP-GnT1(Δ103) C121A       1.7                3.29

One pot refolding of Glycosyltransferases

[0254] Solubilized IB's were mixed at equal amounts, as shown in Table 7.

Table 7. Solubilized IB's were mixed at following amounts before refolding.

Protein                    V (mL)    mg     % of total protein
MBP-ST3GalIII              0.8       3.4    36
MBP-GalT1(Δ129) C342T      0.5       3.4    36
MBP-GnT1(Δ103) C121A       0.8       2.6    28
Total                      2.1       9.4    100

[0255] The protein concentration of the total solubilized IB mixture was 4.5 mg/ml. The
mixture was diluted approximately 1/20 in refolding buffer, making the final concentration of
the total protein mixture 0.22 mg/mL. The refolding buffer contained 55 mM MES, pH 6.5; 550
mM Arginine; 0.055 % PEG3350; 264 mM NaCl; 11 mM KCl; 1 mM GSH; and 0.1 mM
GSSG. Refolding can also be performed in a buffer with Tris HCl, pH 8.2, and a
Cysteine/Cystamine redox couple can be substituted for GSH/GSSG. The IB mixture was
diluted into the refolding buffer and incubated at 4°C overnight (16-18 hours). Estimated
concentrations of the glycosyltransferases in the refolding reaction:

MBP-ST3GalIII               0.081 mg/mL
MBP-GalT1(Δ129) C342T       0.081 mg/mL
MBP-GnT1(Δ103) C121A        0.062 mg/mL
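
The mixing and dilution arithmetic behind Tables 6 and 7 and the estimated concentrations above can be reproduced with a short calculation. This is a sketch only: the stock concentrations and volumes are taken from the tables, the dilution is taken as exactly 20-fold (the text says "approximately 1/20"), and the class and function names are illustrative assumptions.

```python
# Sketch: reproduce the one pot mixing table and post-dilution concentrations.
from dataclasses import dataclass


@dataclass
class Stock:
    name: str
    mg_per_ml: float   # solubilized IB concentration (Table 6)
    volume_ml: float   # volume added to the mix (Table 7)


STOCKS = [
    Stock("MBP-ST3GalIII", 4.23, 0.8),
    Stock("MBP-GalT1", 6.80, 0.5),
    Stock("MBP-GnT1", 3.29, 0.8),
]

DILUTION = 20.0  # assumption: "approximately 1/20" treated as exactly 20-fold


def mix_summary(stocks, dilution):
    """Return total mg, total mL, mix concentration and per-protein rows."""
    masses = {s.name: s.mg_per_ml * s.volume_ml for s in stocks}
    total_mg = sum(masses.values())
    total_ml = sum(s.volume_ml for s in stocks)
    rows = {
        name: {
            "mg": mg,
            "percent": 100.0 * mg / total_mg,
            # concentration of this protein once diluted into refolding buffer
            "refold_mg_per_ml": mg / total_ml / dilution,
        }
        for name, mg in masses.items()
    }
    return total_mg, total_ml, total_mg / total_ml, rows


if __name__ == "__main__":
    total_mg, total_ml, mix_conc, rows = mix_summary(STOCKS, DILUTION)
    print(f"mix: {total_mg:.1f} mg in {total_ml:.1f} mL = {mix_conc:.2f} mg/mL")
    for name, row in rows.items():
        print(f"{name}: {row['mg']:.1f} mg, {row['percent']:.0f} %, "
              f"{row['refold_mg_per_ml']:.3f} mg/mL after dilution")
```

Small differences in the last digit relative to the reported values come from rounding of the intermediate quantities.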

[0256] After overnight refolding, the refolded glycosyltransferase mix was dialyzed to
remove chaotropic agent (i.e., Guanidine HCl). Dialysis was carried out twice against 50 mM
TrisHCl pH 8.0 at 4°C (20 fold per dialysis) in a dialysis bag (SnakeSkin, MWCO: 7 kD,
Pierce). The dialyzed refolded glycosyltransferase mix (Superglycomix, SGM) was
concentrated six fold using VivaSpin 6 mL (MWCO: 10 kD) centrifugal concentrators. After
concentration, all three glycosyltransferases were present in the mixture, as determined by
SDS-PAGE analysis. (Data not shown.) After concentrating the SGM, enzymatic activities of
GnT1, GalT1, and ST3GalIII were determined.
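
As a rough guide to how much chaotrope two dialysis steps might remove, the calculation below may help. It is a sketch under stated assumptions that are not in the original: "20 fold per dialysis" is read as a buffer volume 20 times the sample volume, each step is assumed to reach equilibrium, and guanidine is assumed to pass the membrane freely. The starting value uses the 6 M guanidine HCl solubilization buffer diluted about 1/20 into refolding buffer; the function name is illustrative.

```python
# Sketch: residual small-molecule concentration after repeated equilibrium dialysis.


def residual_concentration(start_molar: float, buffer_fold: float, rounds: int) -> float:
    """Solute remaining in the sample after `rounds` equilibrium dialysis steps.

    buffer_fold is the buffer volume divided by the sample volume.
    """
    if buffer_fold <= 0 or rounds < 0:
        raise ValueError("buffer_fold must be positive and rounds non-negative")
    conc = start_molar
    for _ in range(rounds):
        # At equilibrium the solute is spread evenly over sample plus buffer.
        conc /= 1.0 + buffer_fold
    return conc


# 6 M guanidine HCl in the solubilized IB's, diluted about 1/20 on refolding.
GUANIDINE_IN_REFOLD_M = 6.0 / 20.0

if __name__ == "__main__":
    left = residual_concentration(GUANIDINE_IN_REFOLD_M, buffer_fold=20.0, rounds=2)
    print(f"Estimated residual guanidine: {left * 1000:.2f} mM")
```

Real dialysis may stop short of equilibrium, so the figure is an idealized lower bound rather than a measured value.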

Enzymatic activities of SuperGlycoMix
[0257] Superglycomix (SGM), the one pot refolded glycosyltransferase mix, contains three
glycosyltransferases: ST3GalIII, GalT1 and GnT1. These enzymes were individually
assayed for their enzymatic activities and analyzed using the methods indicated below. The
enzymatic activities are listed in Table 8.

ST3GalIII enzymatic activity assays

[0258] ST3GalIII assays were carried out using HPLC/UV (High Performance Liquid
Chromatography with Ultraviolet Detection). The conversion of LNnT (Lacto-N-Neotetraose)
into LSTd (Lactosialic-Tetrasaccharide-d) using CMP-NAN (cytidine
5'-Monophosphate-β-D-sialic acid) by ST3GalIII enzyme was performed as follows. The
reaction was carried out in a 96 well microtiter plate in 100 µl of 20 mM MOPS, pH 6.5
buffer containing 2 mM CMP-NAN, 30 mM LNnT, 10 mM MnCl2 and 20 µl of refolded
enzyme at 30°C for 120 minutes. The reaction was quenched by heating to 98°C for 1 min.
The microtiter plate was centrifuged at 3600 rpm for 10 min to pellet any precipitate. 75 µl
of supernatant was diluted 1:1 with 75 µl of water. The diluted reaction was analyzed by
LC/UV using a YMC-Pack Polyamine II column with a sodium phosphate buffer/acetonitrile
gradient and detection at 200 nm. The sample product peak area was compared to an LSTd
calibration curve, and the activity was calculated based on the amount of LSTd produced per
min per µl of enzyme in the reaction.
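
The conversion from product formed to a volumetric activity can be sketched as follows. This is an illustration rather than the assay's actual data reduction: it assumes the conventional definition of one unit (U) as 1 µmol of product per minute, so that 1 mU equals 1 nmol per minute, and the product amount in the usage example is hypothetical, chosen only to make the arithmetic easy to follow.

```python
# Sketch: convert product formed in an assay into mU per mL of enzyme.
# Assumption: 1 U = 1 µmol product per minute, hence 1 mU = 1 nmol per minute.


def activity_mU_per_mL(product_nmol: float, minutes: float, enzyme_ul: float) -> float:
    """Volumetric activity from product amount, reaction time and enzyme volume."""
    if minutes <= 0 or enzyme_ul <= 0:
        raise ValueError("minutes and enzyme_ul must be positive")
    rate_mU = product_nmol / minutes      # nmol/min is numerically mU
    enzyme_ml = enzyme_ul / 1000.0        # µl -> mL
    return rate_mU / enzyme_ml


if __name__ == "__main__":
    # Hypothetical: 24 nmol LSTd formed in 120 min with 20 µl of enzyme.
    print(activity_mU_per_mL(product_nmol=24.0, minutes=120.0, enzyme_ul=20.0))
```

The product amount itself would come from the calibration curve, after correcting for any dilution made before injection.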

GalT1 enzymatic activity assays:

[0259] The enzymatic assays were carried out using HPLC/PAD (High Performance Liquid
Chromatography with Pulsed Amperometric Detection). The conversion of LNT2 (Lacto-N-Triose-2)
into LNnT (Lacto-N-Neotetraose) using UDP-Gal (Uridine 5'-Diphosphogalactose)
by GalT1 enzyme was performed as follows. The reaction was carried out in 100 µl of 50
mM Hepes, pH 7 buffer containing 6 mM UDP-Gal, 5 mM LNT-2, 5 mM MnCl2 and 100 µl
of refolded enzyme at 37°C for 60 minutes. The reaction was quenched (1 to 10 dilution)
with water and centrifuged through a 10,000 MWCO spin filter. The filtrate was then diluted
1 to 10. This diluted reaction was analyzed by HPLC using a Dionex DX-500 system and a
CarboPac PA1 column with sodium hydroxide buffer. The sample product peak area was
compared to an LNnT calibration curve, and the activity was calculated based on the amount
of LNnT produced per min per µl of enzyme in the reaction.

GnTI enzymatic activity assays:

[0260] The activity of GnTI is determined by measuring the transfer of a tritiated sugar
from UDP-3H-GlcNAc (Uridine diphosphate N-acetyl-D-glucosamine [6-3H(N)]) to n-octyl
3,6-Di-O-(α-mannopyranosyl) β-D-mannopyranoside (OM3), a trimannosyl core with an
octyl tail. The reaction was carried out in 20 µl of 100 mM MES, pH 6.0 buffer containing 3
mM UDP-GlcNAc, 0.1 mM UDP-3H-GlcNAc, 0.5 mM OM3, 20 mM MnCl2 and 10 µl of
refolded enzyme at 37°C for 60 minutes. The reaction was quenched (1 to 6 dilution) with
water and applied to a polymeric reversed-phase resin in a 96 well format that was previously
conditioned according to the manufacturer's recommendations. The resin was washed twice
with 200 µl of water and the product was eluted with 50 µl of 100% MeOH into a capture
plate. Scintillation fluid (200 µL) was added to each well and the plate was mixed and
counted using a PerkinElmer TopCount NXT microplate scintillation counter. The activity
was calculated based on the amount of 3H-GlcNAc incorporated into the product per min per
µl of enzyme in the reaction.

Table 8. Enzymatic activities of refolded Glycosyltransferases in SGM

Enzymatic activity    mU/mL
GnTI                  1
GalT1                 165
ST3GalIII             10

[0261] The activities reported in the table above are close to or within the range obtained when these
enzymes were refolded separately. GnTI and GalT1 activities are close to those obtained
using mammalian or baculovirus expression systems. ST3GalIII activities are somewhat
lower than those of ST3GalIII preparations obtained from a fungal expression system. The ST3GalIII
assay used here is a modified procedure, and the values reported here are approximately 4-5
fold lower than those obtained using a method based on CE-LIF (Capillary electrophoresis-Laser
induced Fluorescence).

Remodeling RNAseB-Man5 using Superglycomix

[0262] A small glycoprotein, RNAseB with one N linked Man5 sugar, was remodeled by
SGM in the presence of UDP-sugars (UDP-GlcNAc and UDP-Gal). The remodeling reaction
was carried out using either UDP-GlcNAc alone or both UDP-GlcNAc and UDP-Gal to test
both GnT1 and GalT1 activities. Eight µl of SGM was added to 10 mM MES buffer pH 6.5
containing 5 mM UDP-GlcNAc and/or 5 mM UDP-Gal, 9 µg RNAseBMan5, and 5 mM MnCl2
in a 25 µl assay, and incubated at 33°C from overnight to 48 hours. At the end of the reaction, ten µl
aliquots were dialyzed against H2O and 1.5 µl samples were spotted on MALDI-TOF plates.
Samples were analyzed on MALDI-TOF after being treated with TFA and sinapinic acid.

[0263] The remodeling of RNAseBMan5 was done by transferring GlcNAc and Gal onto the
Man5 of the RNAseB. After 48 hrs incubation at 33°C, the majority of the GlcNAc and Gal
transfer onto RNAseB was accomplished, as indicated in MALDI-TOF spectra of the
remodeled RNAseBMan5. Results are summarized in Table 9.

Table 9. MALDI-TOF Spectra of the species after SGM reactions.

                                 m/z of RNAseB species
Reaction                         Man5     Man5-GlcNAc    Man5GlcNAc-Gal
No Enzyme                        14983    -              -
SGM + UDP-GlcNAc                 14973    15177          -
SGM + UDP-GlcNAc + UDP-Gal       14982    15170          15348
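
The m/z values in Table 9 can be compared with the mass increments expected for adding one GlcNAc and one Gal residue. The sketch below is an illustration, not part of the reported analysis: the average residue masses (about 203.19 Da for GlcNAc and 162.14 Da for Gal, i.e. the free sugar minus water) are standard values supplied here, and the 20 Da tolerance is an arbitrary assumption for a protein of roughly 15 kDa.

```python
# Sketch: compare observed MALDI-TOF mass shifts with expected residue masses.
from typing import Optional, Tuple

HEXNAC_RESIDUE_DA = 203.19   # GlcNAc residue, average mass (assumed standard value)
HEXOSE_RESIDUE_DA = 162.14   # Gal residue, average mass (assumed standard value)

# (Man5, Man5-GlcNAc, Man5GlcNAc-Gal) m/z values from Table 9; None = not reported.
TABLE_9 = {
    "SGM + UDP-GlcNAc": (14973, 15177, None),
    "SGM + UDP-GlcNAc + UDP-Gal": (14982, 15170, 15348),
}


def mass_shifts(row: Tuple[int, Optional[int], Optional[int]]):
    """Return (GlcNAc shift, Gal shift) in Da; None where a peak is missing."""
    man5, with_glcnac, with_gal = row
    glcnac_shift = None if with_glcnac is None else with_glcnac - man5
    if with_glcnac is None or with_gal is None:
        gal_shift = None
    else:
        gal_shift = with_gal - with_glcnac
    return glcnac_shift, gal_shift


def consistent(observed: float, expected: float, tolerance_da: float = 20.0) -> bool:
    """True if the observed shift lies within the chosen tolerance of the expected one."""
    return abs(observed - expected) <= tolerance_da


if __name__ == "__main__":
    for reaction, row in TABLE_9.items():
        print(reaction, mass_shifts(row))
```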

GlycoPEGylation EPO remodeling using SGM

[0264] GlycoPEGylation (20 K) was carried out in a one pot reaction composed of the
following components: 10 mM MES pH 6.5, 5 mM MgCl2, 5 mM UDP-GlcNAc, 5 mM
UDP-GalNAc, 0.5 mM CMP-SA-PEG (20 kDa), 24 µg EPO, 8 µL concentrated SGM. In
control reactions, SGM was replaced by individual enzymes, either refolded or expressed in
mammalian cells, insect cells, or Aspergillus. After overnight incubations, the reactions
were analyzed on SDS-polyacrylamide gel. Results are shown in Figure 5. SGM added 20K
PEG to EPO.

Assessment of one pot refolding conditions for multiple glycosyltransferases
[0265] Conditions for refolding multiple glycosyltransferases were assessed, including pH
and refolding two or three enzymes at once.

Preparation of glycosyltransferase inclusion bodies
[0266] E. coli strains transformed with glycosyltransferase expression plasmids were
described previously, with one exception. MBP-ST3GalIII was expressed in JM109 cells
from a pCWori-ST3GalIII plasmid. The inclusion bodies were isolated and solubilized as
described above. Protein contents were assessed as described above and are shown in Table
10.

Table 10. Solubilized IB's were mixed at following amounts before refolding.

Protein                    A280    A280 (at 1 mg/ml)    mg/ml    % (of sol. protein)
MBP-ST3GalIII              32.3    1.49                 21.7     13.6
MBP-GalT1(Δ129) C342T      35.7    1.39                 25.7     13.7
MBP-GnT1(Δ103) C121S       42.8    1.7                  25.2     9.7
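
The concentrations in Table 10 follow from dividing the measured A280 by the calculated A280 of a 1 mg/ml solution of each fusion protein. A minimal sketch of that calculation is given below; it assumes a 1 cm path length and that the A280 values listed are already corrected for any dilution made before the reading. The dictionary and function names are illustrative.

```python
# Sketch: protein concentration from A280 and the A280 of a 1 mg/mL solution.


def protein_mg_per_ml(a280: float, a280_at_1_mg_per_ml: float) -> float:
    """Concentration (mg/mL) = measured A280 / A280 of a 1 mg/mL solution."""
    if a280_at_1_mg_per_ml <= 0:
        raise ValueError("A280 at 1 mg/mL must be positive")
    return a280 / a280_at_1_mg_per_ml


# (measured A280, A280 at 1 mg/mL) as listed in Table 10.
TABLE_10 = {
    "MBP-ST3GalIII": (32.3, 1.49),
    "MBP-GalT1": (35.7, 1.39),
    "MBP-GnT1": (42.8, 1.7),
}

if __name__ == "__main__":
    for protein, (a280, per_mg) in TABLE_10.items():
        print(f"{protein}: {protein_mg_per_ml(a280, per_mg):.1f} mg/mL")
```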

One pot refolding of Glycosyltransferase IB mixtures

[0267] After determining their protein contents, solubilized IB's were mixed in the amounts
shown before being diluted into the refolding buffers (Table 11). Refolding experiments of the GT's
were carried out in 44 ml volume at 4°C at stationary phase using buffer A or B (below) and
0.1 mM GSSG and 1 mM GSH. Buffer A: 55 mM MES pH 6.5, 550 mM Arginine, 0.055 %
PEG3350, 264 mM NaCl, 11 mM KCl, supplemented with 1 mM GSH, 0.1 mM GSSG.
Buffer B: 55 mM TrisHCl pH 8, 550 mM Arginine, 0.055 % PEG3350, 264 mM NaCl, 11
mM KCl, supplemented with 1 mM GSH, 0.1 mM GSSG.

Table 11. Mixing amounts of solubilized GT IB's in 2 mL IBSB

Refolding in Buffer A

Refold 1 (A-2x)             Conc (mg/mL)    V (mL)    mg
MBP-GnT1 (Δ103) C121S       25.2            0.2       5
MBP-GalT1 (Δ129) C342T      25.7            0.2       5
IBSB                                        1.6

Refold 2 (A-3x)             Conc (mg/mL)    V (mL)    mg
MBP-GnT1 (Δ103) C121S       25.2            0.2       5
MBP-GalT1 (Δ129) C342T      25.7            0.2       5
MBP-ST3GalIII               21.7            0.4       8.7
IBSB                                        1.2

Refolding in Buffer B

Refold 3 (B-2x)             Conc (mg/mL)    V (mL)    mg
MBP-GnT1 (Δ103) C121S       25.2            0.2       5
MBP-GalT1 (Δ129) C342T      25.7            0.2       5
IBSB                                        1.4

Refold 4 (B-3x)             Conc (mg/mL)    V (mL)    mg
MBP-GnT1 (Δ103) C121S       25.2            0.2       5
MBP-GalT1 (Δ129) C342T      25.7            0.2       5
MBP-ST3GalIII               21.7            0.4       8.7
IBSB                                        1.2


[0268] For double refolding (2x, two glycosyltransferases), 10 mg total protein in 2 ml was
added into 41 mL refolding buffer (above), 0.45 mL 100 mM GSH, and 0.45 mL 10 mM GSSG;
after dilution total protein was 0.44 mg/ml. For triple refolding (3x, three
glycosyltransferases), 18.7 mg total protein in 2 ml was added into 41 mL refolding buffer
(above), 0.45 mL 100 mM GSH, and 0.45 mL 10 mM GSSG. After dilution total protein was 0.83
mg/ml. The protein concentrations were higher than in the previous triple refolding experiment
(0.22 mg/ml in SGM). Estimated concentrations of the glycosyltransferases in the refolding
reaction follow:

MBP-ST3GalIII               0.39 mg/mL
MBP-GalT1 (Δ129) C342T      0.23 mg/mL
MBP-GnT1 (Δ103) C121S       0.23 mg/mL
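
The redox couple concentrations produced by the GSH and GSSG stock additions described above can be checked with a simple dilution calculation. The sketch below assumes the final volume is the sum of the four volumes stated in the text (2 ml protein, 41 mL buffer, and two 0.45 mL stock additions); the function and constant names are illustrative.

```python
# Sketch: final GSH and GSSG concentrations after adding concentrated stocks.


def final_mM(stock_mM: float, stock_ml: float, total_ml: float) -> float:
    """Concentration after diluting stock_ml of a stock into total_ml."""
    if total_ml <= 0:
        raise ValueError("total_ml must be positive")
    return stock_mM * stock_ml / total_ml


# Assumed final volume: protein + refolding buffer + GSH stock + GSSG stock.
TOTAL_ML = 2.0 + 41.0 + 0.45 + 0.45

GSH_mM = final_mM(100.0, 0.45, TOTAL_ML)    # reduced glutathione
GSSG_mM = final_mM(10.0, 0.45, TOTAL_ML)    # oxidized glutathione

if __name__ == "__main__":
    print(f"total volume: {TOTAL_ML:.1f} mL")
    print(f"GSH: {GSH_mM:.3f} mM, GSSG: {GSSG_mM:.4f} mM, "
          f"ratio {GSH_mM / GSSG_mM:.0f}:1")
```

The result is close to the nominal 1 mM GSH and 0.1 mM GSSG given for buffers A and B, a 10:1 reduced to oxidized ratio.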

[0269] After overnight refolding, the refolded glycosyltransferase mix was dialyzed.
Dialysis was carried out twice against 50 mM TrisHCl pH 8.0 at 4°C in a dialysis bag
(SnakeSkin, MWCO: 7 kD, Pierce). After dialysis, the glycosyltransferase mix was
concentrated 9-12 fold using 6 mL VIVA-Spin (MWCO: 10 K) centrifugal concentrators.

[0270] SDS-PAGE analysis demonstrated that the proteins were present after refolding, 
dialysis, and concentration. 

Enzymatic assays of refolded glycosyltransferase mixtures

[0271] Enzymatic assays were performed as described above. Results are shown in Table 
12. 

Table 12. Enzymatic activities of refolded Glycosyltransferases after double and triple
refolding experiments.

Folding             Fold conc    Enzymatic activity    mU/mL
Buffer A (A-2x)                  GnT1                  0.84
                                 GalT1                 598
Buffer A (A-3x)                  GnT1                  0.16
                                 GalT1                 306
                                 ST3GalIII             4
Buffer B (B-2x)                  GnT1                  3.32
                                 GalT1                 747
Buffer B (B-3x)                  GnT1                  0.47
                                 GalT1                 425
                                 ST3GalIII             11


[0272] The highest activity was seen when MBP-fused GnT1 and GalT1 were mixed in equal
amounts and refolded in buffer B. Adding a non-equivalent amount of MBP-fused ST3GalIII
affected refolding efficiency due to the high total protein. Nevertheless, two different refolding
buffers, using either two GT's or three GT's, can be used to obtain active soluble proteins.

Example 5: Refolding eukaryotic GalNAcT2.

[0273] A truncated human GalNAcT2 enzyme was expressed in E. coli and used to
determine optimal conditions for solubilization and refolding using the methods described
above. The full length human GalNAcT2 nucleic acid and amino acid sequences are
provided in Figures 13A and B. The sequences of the mutant protein, GalNAcT2(Δ51), are
shown in Figures 14A and B. The mutant was expressed in E. coli as an MBP fusion protein,
MBP-GalNAcT2(Δ51).

[0274] Cultures of bacteria expressing MBP-GalNAcT2(Δ51) were grown and harvested as
described above. Inclusion bodies were purified from bacteria as described above.
Solubilization of the inclusion bodies was performed at pH 6.5 or at pH 8.0. After
solubilization, MBP-GalNAcT2(Δ51) protein was refolded at either pH 6.5 or pH 8.0 using
buffers A and B, i.e., Buffer A: 55 mM MES pH 6.5, 550 mM Arginine, 0.055 % PEG3350,
264 mM NaCl, 11 mM KCl, supplemented with 1 mM GSH, 0.1 mM GSSG; and Buffer B:
55 mM TrisHCl pH 8, 550 mM Arginine, 0.055 % PEG3350, 264 mM NaCl, 11 mM KCl,
supplemented with 1 mM GSH, 0.1 mM GSSG. After refolding, MBP-GalNAcT2(Δ51)
protein was dialyzed and then concentrated. Figure 15 provides a demonstration of the
protein concentration of refolded MBP-GalNAcT2(Δ51) after solubilization at pH 6.5 or pH
8.0 and refolding at pH 6.5 or pH 8.0.

[0275] A radiolabeled [3H]-UDP-GalNAc assay was performed to determine the activity of
the E. coli-expressed refolded MBP-GalNAcT2(Δ51) by monitoring the addition of
radiolabeled GalNAc to a peptide acceptor. The acceptor was a MUC-2-like peptide having
the sequence (MVTPTPTPTC). The peptide was dissolved in 1M Tris-HCl pH=8.0. See,
e.g., USSN 60/576,530 filed June 3, 2004; and US provisional patent application Attorney
Docket Number 040853-01-5149-P1, filed August 3, 2004; both of which are herein
incorporated by reference for all purposes. Figure 16 provides a demonstration of the
enzymatic activity of refolded MBP-GalNAcT2(Δ51) after solubilization at pH 6.5 or pH 8.0
and refolding at pH 6.5 or pH 8.0. Figure 17 provides a demonstration of the specific activity
of refolded MBP-GalNAcT2(Δ51) after solubilization at pH 6.5 or pH 8.0 and refolding at
pH 6.5 or pH 8.0. The highest activity levels were observed with MBP-GalNAcT2(Δ51) that
had been solubilized at pH 8.0 and refolded at pH 8.0. The highest specific activity levels
were also observed with solubilization at pH 8.0 and refolding at pH 8.0.

[0276] Solubilized and refolded MBP-GalNAcT2(Δ51) was assayed for its ability to add
GalNAc to the G-CSF protein. The assay consisted of an aliquot of enzyme and a reaction
buffer (27 mM MES, pH=7, 200 mM NaCl, 20 mM MgCl2, 20 mM MnCl2, and 0.1% Tween
80), G-CSF Protein (2 mg/ml in H2O), and 100 mM UDP-GalNAc. For each refold sample,
4.4 µL of sample were added to 15 µL of reaction solution. For the positive control, 1 µL of
standard GalNAcT2 Baculovirus was added along with 3.4 µL of H2O to one tube. Reactions
were incubated at 32°C on a rotary shaker for several days, during which time an overnight
time point and a 5 day time point were assayed by MALDI. See, e.g., USSN 60/576,530
filed June 3, 2004; and US provisional patent application Attorney Docket Number
040853-01-5149-P1, filed August 3, 2004; both of which are herein incorporated by reference for all
purposes.

[0277] Figures 18A and 18B provide results of remodeling of recombinant granulocyte
colony stimulating factor (GCSF) using refolded MBP-GalNAcT2(Δ51) after solubilization
at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0. A positive control, i.e., purified
MBP-GalNAcT2(Δ51) that had been expressed in baculovirus, and a negative control, i.e., reaction
mixture lacking a substrate, were included. The highest levels of GCSF remodeling activity
were seen using MBP-GalNAcT2(Δ51) that had been solubilized at pH 8.0 and refolded at
pH 8.0.

Example 6: Refolding and purification of eukaryotic GalNAcT2.

[0278] Four liters of bacteria that express recombinant MBP-GalNAcT2(Δ51) were grown
and harvested. Inclusion bodies were isolated and washed, and two grams dry weight of
inclusion bodies were solubilized at 4°C in 200 mL of solubilization buffer (7 M urea/ 50 mM
Tris/ 10 mM DTT/ 5 mM EDTA at pH 8.0). After solubilization, the mixture was then diluted
into 4 L of refolding buffer (50 mM Tris/ 550 mM L-Arginine/ 250 mM NaCl/ 10 mM KCl/
0.05% PEG 3350/ 4 mM L-cysteine/ 1 mM cystamine dihydrochloride at pH 8.0). Refolding
was carried out at 4-10°C for about 20 hours, with stirring. The mixture was then filtered
using a 10SP CUNO filter, concentrated 5 fold on 4 ft2 membrane, and diafiltered 4 times with
10 mM Tris/ 5 mM NaCl at pH 8.0. The conductivity of the final refolded
MBP-GalNAcT2(Δ51) solution was 1.4 mS/cm. The refolded protein was stored at 4°C for several
days.
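
The effect of the diafiltration step on small buffer components can be estimated with the usual constant-volume washout relation, C/C0 = exp(-N x S), where N is the number of diavolumes and S the sieving coefficient of the solute. The sketch below rests on assumptions not stated in the text: "diafiltered 4 times" is read as four diavolumes of continuous, constant-volume diafiltration, and arginine is treated as passing the membrane freely (S = 1). The function name is illustrative.

```python
# Sketch: washout of a freely permeating solute during constant-volume diafiltration.
import math


def residual_fraction(diavolumes: float, sieving: float = 1.0) -> float:
    """Fraction of solute remaining after the given number of diavolumes."""
    if diavolumes < 0 or not 0.0 <= sieving <= 1.0:
        raise ValueError("diavolumes must be >= 0 and sieving within [0, 1]")
    return math.exp(-sieving * diavolumes)


ARGININE_START_mM = 550.0  # refolding buffer value from the text

if __name__ == "__main__":
    remaining = ARGININE_START_mM * residual_fraction(4.0)
    print(f"Estimated arginine after 4 diavolumes: {remaining:.1f} mM")
```

Under the same assumption the preceding 5 fold concentration step does not change the concentration of a freely permeating solute, so only the diafiltration is modeled.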

[0279] The refolded proteins were applied to a Q Sepharose XL (QXL) column (Amersham
Biosciences, Piscataway, NJ). An elution profile is shown in Figure 19 and the enzymatic
activity of specific column fractions is shown in Figure 20. The active fractions were
combined and applied to a Hydroxyapatite Type I (80 µm) (BioRad, Hercules, CA) column.
An elution profile is shown in Figure 21 and activity of HA type I eluted fractions is shown in
Figure 22. The combination of QXL and HA type I chromatography resulted in active,
highly purified MBP-GalNAcT2(Δ51).

[0280] It is understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be
suggested to persons skilled in the art and are to be included within the spirit and purview of
this application and scope of the appended claims. All publications, patents, and patent
applications cited herein are hereby incorporated by reference in their entirety for all
purposes.

WHAT IS CLAIMED IS:

1. A recombinant eukaryotic N-acetylglucosaminyltransferase I (GnTI)
enzyme, comprising the catalytic domain of the GnTI enzyme;
wherein an unpaired cysteine residue is mutated, and
wherein the GnTI enzyme catalyzes the transfer of a donor substrate to an
acceptor substrate.

2. The GnTI enzyme of claim 1, wherein the vertebrate GnTI enzyme is
human.

3. The GnTI enzyme of claim 2, comprising a CYS121 mutation,
wherein the CYS121 mutation is a member of the group consisting of a
CYS121SER mutation, a CYS121ALA mutation, and a CYS121ASP mutation.

4. The GnTI enzyme of claim 2, comprising an ARG120ALA,
CYS121HIS mutant.

5. The GnTI enzyme of claim 1, wherein the GnTI enzyme further
comprises an amino acid tag.

6. The GnTI enzyme of claim 5, wherein the amino acid tag is selected
from the group consisting of a maltose binding protein (MBP), a polyhistidine tag, a
glutathione S transferase (GST), a starch binding protein (SBP), and a myc epitope.

7. The GnTI enzyme of claim 1, wherein the GnTI enzyme comprises an
amino acid sequence from Figures 7-11.

8. The GnTI enzyme of claim 7, further comprising a maltose binding
domain.

9. An isolated polynucleotide, the polynucleotide comprising a nucleic
acid sequence that encodes a eukaryotic N-acetylglucosaminyltransferase I (GnTI) enzyme
comprising a catalytic domain of the GnTI enzyme,
wherein an unpaired cysteine residue is mutated, and
wherein the GnTI enzyme catalyzes the transfer of a donor substrate to an
acceptor substrate.

10. The GnTI enzyme of claim 9, wherein the GnTI enzyme is a human
protein.

11. The isolated polynucleotide of claim 10, wherein the GnTI enzyme
comprises a CYS121 mutation,
wherein the CYS121 mutation is a member of the group consisting of a
CYS121SER mutation, a CYS121ALA mutation, and a CYS121ASP mutation.

12. The isolated polynucleotide of claim 11, comprising an ARG120ALA,
CYS121HIS mutant.

13. The isolated polynucleotide of claim 9, wherein the GnTI enzyme
further comprises an amino acid tag.

14. The GnTI enzyme of claim 13, wherein the amino acid tag is selected
from the group consisting of a maltose binding protein (MBP), a polyhistidine tag, a
glutathione S transferase (GST), a starch binding protein (SBP), and a myc epitope.

15. The isolated polynucleotide of claim 9, wherein the GnTI enzyme
comprises an amino acid sequence from Figures 7-11.

16. An expression vector comprising the isolated polynucleotide of claim
9.

17. A host cell comprising the expression vector of claim 16.

18. A method of producing a eukaryotic N-acetylglucosaminyltransferase I
(GnTI) enzyme, the method comprising culturing a host cell of claim 17 under conditions
suitable for the production of the GnTI enzyme.

19. A method of adding an N-acetylglucosamine residue to an acceptor
molecule comprising a terminal mannose residue, the method comprising contacting the
acceptor molecule with an activated N-acetylglucosamine molecule and a eukaryotic
N-acetylglucosaminyltransferase I (GnTI) enzyme of claim 1.

20. The method of claim 19, wherein the acceptor molecule is a
glycoprotein.

21. A method of refolding at least two insoluble, recombinant eukaryotic
glycosyltransferase proteins in a single vessel, the method comprising
contacting the glycosyltransferases with a refolding buffer under conditions
suitable for refolding the enzymes, wherein the refolding buffer comprises a buffer and a
redox couple, and wherein the refolded glycosyltransferases have biological activity.

22. The method of claim 21, wherein the refolding buffer further
comprises arginine.

23. The method of claim 21, wherein the refolding buffer further
comprises PEG.

24. The method of claim 21, wherein the glycosyltransferases further
comprise an amino acid tag.

25. The method of claim 24, wherein the amino acid tag is a member
selected from the group consisting of a maltose binding protein (MBP), a polyhistidine tag, a
glutathione S transferase (GST), a starch binding protein (SBP), and a myc epitope.

26. The method of claim 21, wherein a first glycosyltransferase is a
eukaryotic N-acetylglucosaminyltransferase I (GnTI).

27. The method of claim 21, wherein a first glycosyltransferase is a
eukaryotic N-acetylgalactosaminyltransferase 2 (GalNAcT2).

28. The method of claim 21, wherein the glycosyltransferases are part of
an N-linked glycan biosynthetic pathway.

29. The method of claim 28, wherein a first glycosyltransferase is a
sialyltransferase.

30. The method of claim 28, wherein a first glycosyltransferase is a
eukaryotic GnTI.

31. The method of claim 28, wherein a first glycosyltransferase is a
galactosyltransferase.

32. The method of claim 28, wherein a first glycosyltransferase is a
sialyltransferase, a second glycosyltransferase is an N-acetylglucosaminyltransferase, and a
third glycosyltransferase is a galactosyltransferase.

33. The method of claim 21, wherein the glycosyltransferases are part of
an O-linked glycan biosynthetic pathway.

34. The method of claim 33, wherein a first glycosyltransferase is a
eukaryotic GalNAcT2.

35. A reaction mixture for producing an oligosaccharide, the reaction
mixture comprising at least two glycosyltransferases that have been refolded in the same
vessel, wherein a first glycosyltransferase is a eukaryotic N-acetylglucosaminyltransferase I
(GnTI) enzyme of claim 1.

36. The reaction mixture of claim 35, wherein a second glycosyltransferase
is a sialyltransferase.

37. The reaction mixture of claim 35, wherein a second glycosyltransferase
is a galactosyltransferase.

38. The reaction mixture of claim 35, wherein a second glycosyltransferase
is a sialyltransferase, and a third glycosyltransferase is a galactosyltransferase.

39. A method of producing an oligosaccharide, the method comprising contacting
an acceptor molecule with a donor sugar, and a reaction mixture of claim 35.

40. A method of refolding an insoluble recombinant eukaryotic
sialyltransferase, the method comprising the steps of:
(a) solubilizing the sialyltransferase; and
(b) contacting the soluble sialyltransferase with a buffer comprising a redox
couple to refold the sialyltransferase, wherein the refolded sialyltransferase catalyzes the
transfer of sialic acid from a donor substrate to an acceptor substrate.

41. The method of claim 40, further comprising the step of dialyzing or
diafiltering the refolded sialyltransferase.

42. The method of claim 40, wherein the buffer further comprises a
detergent.

43. The method of claim 40, wherein the buffer further comprises a
chaotropic agent.

44. The method of claim 40, wherein the buffer further comprises arginine.

45. The method of claim 40, wherein the buffer pH is between 6.0 and
10.0.

46. The method of claim 45, wherein the buffer pH is between 6.5 and 8.0.

47. The method of claim 45, wherein the buffer pH is between 8.0 and 9.0.

1 48. The method of claim 40, wherein the sialyltransferase comprises an 

2 amino acid tag. 

1 49. The method of claim 48, wherein the amino acid tag is selected from 

2 the group consisting of a maltose binding protein (MBP), a polyhistidine tag, a glutathione S 

3 transferase (GST), a starch binding protein (SBP), and a myc epitope. 

1 50. The method of claim 48, further comprising the step of purifying the 

2 sialyltransferase using a tag binding molecule. 

1 51. The method of claim 50, wherein the amino acid tag is MBP and the 

2 tag binding molecule is amylose, maltose, or a cyclodextrin. 

1 52. The method of claim 40, wherein the refolded sialyltransferase 

2 catalyzes the transfer of sialic acid from CMP-sialic acid to a glycoprotein. 

1 53. The method of claim 40, wherein the refolded sialyltransferase 

2 catalyzes the transfer of 10KPEG or 20K PEG from CMP-SA-PEG (10 kDa) or CMP-SA- 

3 PEG (20 kDa)to a glycoprotein. 

1 54. The method of claim 40, wherein the sialyltransferase is rat liver 

2 ST3Gaim. 
55. The method of claim 54, wherein the recombinant mammalian sialyltransferase comprises
a maltose binding protein (MBP) amino acid tag.

56. The method of claim 55, further comprising the step of purifying the refolded
mammalian sialyltransferase using a tag binding molecule selected from the group
consisting of amylose, maltose, or a cyclodextrin.

57. The method of claim 54, wherein the redox couple is reduced glutathione/oxidized
glutathione (GSH/GSSG).

58. The method of claim 57, wherein the molar ratio of GSH/GSSG is between 100:1 and 1:10.

59. The method of claim 54, wherein the buffer comprises about 0.02-10 mM GSH, 0.005-10 mM
GSSG, 0.005-10 mM lauryl maltoside, 50-250 mM NaCl, 2-10 mM KCl, 0.01-0.05% PEG 3350, and
150-550 mM L-arginine.

60. A method of adding a sialyl moiety to a glycoprotein, the method comprising
contacting the glycoprotein with CMP-sialic acid and a refolded mammalian
sialyltransferase of claim 40.

61. A method of adding a PEG moiety to a glycoprotein, the method comprising contacting
the glycoprotein with CMP-10KPEG or CMP-20KPEG and a refolded mammalian sialyltransferase
of claim 40.

62. A method of refolding an insoluble recombinant eukaryotic
N-acetylgalactosaminyltransferase 2 (GalNAcT2), the method comprising the steps of:
(a) solubilizing the GalNAcT2 in a solubilization buffer; and
(b) contacting the soluble GalNAcT2 with a refolding buffer comprising a redox couple to
refold the GalNAcT2, wherein the refolded GalNAcT2 catalyzes the transfer of
N-acetylgalactosamine from a donor substrate to an acceptor substrate.

63. The method of claim 62, further comprising the step of dialyzing or diafiltering the
refolded GalNAcT2.

64. The method of claim 62, wherein the refolding buffer further comprises a detergent.
65. The method of claim 62, wherein the refolding buffer further comprises a chaotropic
agent.

66. The method of claim 62, wherein the refolding buffer further comprises arginine.

67. The method of claim 62, wherein the refolding buffer pH is between 6.0 and 10.0.

68. The method of claim 62, wherein the redox couple is reduced glutathione/oxidized
glutathione (GSH/GSSG).

69. The method of claim 62, wherein the redox couple is cysteine/cystamine.

70. The method of claim 62, wherein the refolding buffer pH is about 8.0.

71. The method of claim 62, wherein the solubilization buffer pH is between 8.0 and 9.0.

72. The method of claim 62, wherein the solubilization buffer pH is about 8.0.

73. The method of claim 62, wherein the GalNAcT2 comprises an amino acid tag.

74. The method of claim 73, wherein the amino acid tag is selected from the group
consisting of a maltose binding protein (MBP), a polyhistidine tag, a glutathione
S-transferase (GST), a starch binding protein (SBP), and a myc epitope.

75. The method of claim 73, further comprising the step of purifying the GalNAcT2 using a
tag binding molecule.

76. The method of claim 75, wherein the amino acid tag is MBP and the tag binding
molecule is amylose, maltose, or a cyclodextrin.
77. The method of claim 62, wherein the refolded GalNAcT2 catalyzes the transfer of
N-acetylgalactosamine from a donor substrate to a peptide, a protein, a glycopeptide or a
glycoprotein.
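
The numeric limits recited in claims 58 and 59 can be illustrated with a short check. The following is a minimal sketch, not part of the claimed method: the ranges are copied from those two claims, the word "about" in claim 59 is ignored so the bounds are treated as strict, and the recipe shown is a hypothetical example rather than a formulation taken from the specification. The names `CLAIM_59_RANGES`, `out_of_range`, and `ratio_in_range` are illustrative assumptions.

```python
# Illustrative only: ranges copied from claims 58 and 59 of this document.
# "about" in claim 59 is ignored here, so the bounds are applied strictly.
CLAIM_59_RANGES = {
    "GSH_mM": (0.02, 10.0),
    "GSSG_mM": (0.005, 10.0),
    "lauryl_maltoside_mM": (0.005, 10.0),
    "NaCl_mM": (50.0, 250.0),
    "KCl_mM": (2.0, 10.0),
    "PEG3350_percent": (0.01, 0.05),
    "L_arginine_mM": (150.0, 550.0),
}

# Claim 58: GSH/GSSG molar ratio between 100:1 and 1:10.
RATIO_MIN, RATIO_MAX = 1 / 10, 100 / 1


def out_of_range(recipe):
    """Return names of components that are missing or outside the claim 59 ranges."""
    problems = []
    for name, (low, high) in CLAIM_59_RANGES.items():
        value = recipe.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    return problems


def ratio_in_range(recipe):
    """Return True if the GSH/GSSG molar ratio lies within the claim 58 window."""
    ratio = recipe["GSH_mM"] / recipe["GSSG_mM"]
    return RATIO_MIN <= ratio <= RATIO_MAX


# Hypothetical recipe, chosen only to exercise the checks.
hypothetical_recipe = {
    "GSH_mM": 1.0,
    "GSSG_mM": 0.1,
    "lauryl_maltoside_mM": 0.1,
    "NaCl_mM": 100.0,
    "KCl_mM": 5.0,
    "PEG3350_percent": 0.02,
    "L_arginine_mM": 400.0,
}

print(out_of_range(hypothetical_recipe))
print(ratio_in_range(hypothetical_recipe))
```

A recipe passes when `out_of_range` returns an empty list and `ratio_in_range` returns `True`. Such a check says nothing about whether refolding would actually succeed.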
PATENT

Attorney Docket No.: 019957-016820US 
Client Reference No.: NEO00255 PR 

METHODS OF REFOLDING MAMMALIAN 
GLYCOSYLTRANSFERASES 

ABSTRACT OF THE DISCLOSURE 
The present invention provides methods of refolding mammalian 
glycosyltransferases that have been produced in bacterial cells, and methods to use such 
refolded glycosyltransferases, including glycosyltransferase mutants that have enhanced 
ability to be refolded. The invention also provides methods of refolding more than one 
glycosyltransferase in a single vessel, methods to use such refolded glycosyltransferases, and 
reaction mixtures comprising the refolded glycosyltransferases. 

Figure 2

Figure 5. GlycoPEGylation (20 K) of EPO


10 20 30 40 50 60 

/usr/t MLKKQSAGLVLWQAILFVAWNALIjIiIjFFWTRPAPGRPPSVSALDGDPASLTREVIRLAQD 

P27115 MLKKQSAGLVLWGAILFVAWNAIiLLLFFWTRPVPSRLPSDNALDDDPASLTREVIRLAQD 
10 20 30 40 50 60 

70 80 90 100 110 

/usr/t AEVELERQRGLLQQIGD- -ALSSQRGRVPTAAPPAQPRVPVTPAPAVIPILVIACDRSTV 

P27115 AEVEIiERQRGLIiQQIREHHALWSQRWKVPTAAPPAQPHVPVTPPPAVIPILVIACDRSTV 
70 80 90 100 110 120 

120 130 140 150 160 170 

/usr/t RRCLD KLLHYRPS AEL FP 1 I VSQDCGHEETAQAIAS YGSAVTHI RQPDLSS IAVPPDHRK 

P27115 PJICLDKLIjHYRPSAELFPIIVSQDCGHEETAQVIASYGSAVTHIRQPDLSNIAVQPDHRK 
130 140 150 160 170 180 

180 190 200 210 220 230 

/usr/t FQGYYKlARHYRWALGQVFRQFRFPAAVVVEDDLEVAPDFFEYFRATyPLIjKADPSLWCV 

P27115 FQGYYKIARHYRWALGQI FHNFNYPAAWVEDDLEYAPDFFE YFQATYPLLKADPSLWCV 
190 200 210 220 230 240 

240 250 260 270 280 290 

/usr/t SAWNDNGKEQMVDASRPEUYRTDFFPGLGWIiljIiAEIiHAELEPKWPKAFWDDWMRRPEQR 



P27115 SAWNDNGKEQMVDSSKPELLYRTDFFPGLGWLLLAELWAELEPKWPKAFWDDWMRRPEQR 
250 260 270 280 290 300 

300 310 320 330 340 350 

/usr/t QGRACIRPEISRTMTFGRKGVSHGQFFDQHLKFIKIiNQQFVHFTQIiDLSYIiQREAYDRDF 

P27115 KGRACVRPE I SRTMTFGRKGVSHGQFFDQHLKF I KLNQQFVPFTQIjDIjS YLQQEAYDRDF 
310 320 330 340 350 360 

360 370 380 390 400 410 

/usr/t LARVYGAPQLQVE KVRTNDRKELGEVRVQ YTGRDS FKAFAKALGVMDDLKSGVPRAG YRG 

P27115 LARVYGAPQLQVE KVRTNDRKEIXjF/VTRVQYTGRDSFKAFAKAIjG^ 

370 380 390 400 410 420 

420 430 440 

/usr/t IVTFQFRGRRVHLAPPPTWEGYDPSWN 

P27115 IVTFLFRGRRVHIiAPPQTWDGYDPSWT 
430 440 
GnTI Cys121Ser mutant

avipilviacdrstvrrsldkllhyipsael^iivsqdcgheetaqaiasygsavthirqpdlssiavppdhrk 

Igqv^fiipaavweddlevapdffeyfoty^ 

lepkwpkafwddwmnpeqrqgraci^ 

apqlqvekvrtnclrkelgevrvqytgrd^^ 



Gcggtgattcccatcctggtcatcgcctgtgaccgcagcactg^cggcgctctctagacaagctgctgcattatcggcc^^^ 

gctcttccccatcatcgttagccaggactgcgggcacgaggagacggcccaggccatcgcctcctacggcagcgcggtcacgcaca 

tccggcagcccgacctgagcagcattgcggtgccgccggatxaccgcaagttccagggctactacaagatcgcgcgccactaccg 

ctgggcgctgggccaggtcttccggcagmcgcttccccgcggccgtggtggtggaggatgacctggaggtggccccggac^^ 

cgagtactttcgggccacctatccgctgctgaaggccgacccctccctgtggtgcgtctcggcctggaatgacaa^ 

gatggtggacgccagcaggcctgagctgctctaccgcaccgacttmccctggcctgggctggctgctgttggccgagctctgg 

gagctggagcccaagtggccaaaggccttctgggacgactggatgcggcggccggagcagcggcaggggcgggcctgcatacg 

ccctgagatctcaagaacgatgacctttggcxgcaagggtgtgagccacgggcagttctttgaccagcacctcaagtttatcaagctga 

accagcagtttgtgcacttcacccagctggacctgtcttacctgcagcgggaggcctatgaccgagamcctcgcccgcgtc^ 

gctccccagctgcaggtggagaaagtgaggaccaatgaccggaaggagctgggggaggtgcgggtgcagtatacgggcaggga 

cagcttcaaggctttcgccaaggctctgggtgtcatggatgaccttaagtcgggggttccgagagctggctaccggggtattgtcacctt 

ccagttcccgggccgccgtgtccacctggcgcccccaccgacgtgggagggctatgatcctagctggaattag 



Figure 7 
GnTI Cys121Asp



avipilviacdrstvrrdldMlhyrpsael^^ 

lgqv^^aavweddlevapdffeyfratypllta^ 

lepkwpkafwddwmirpeqrqgrac^^ 

apqlqvekvrtndrkelgevi^q^grdsfkafakalgvmddlksgvpragyrgivtfq^ 



Gcggtgattcccatcctggtcatcgcctgtgaccgcagcactgttcggcgcgatctagacaagctgctgcattatcggccctcggctg 
agctottccccatcatcgttagccaggactgcgggcacgaggagacggcccaggccatcgcctcctacggcagcgcggtcacgcac 
atccggcagcccgacctgagcagcattgcggtgccgccgga«:accgcaagttocagggctactacaagatcgcgcg^ 
gctgggcgctgggccaggjcttccggcagtttcgcttccc^ 

tcgagtactttcgggccacctatccgctgctgaaggccgacccctccctgtggtgcgtctcggcctggaatga^ 
agatggtggacgccagcaggcctgagctgctctac«gcaccgac^ 

tgagctggagcccaagtggccaaaggccttctgggacgactggatgcggcggccggagcagcggcaggggcgggcctgcatac 

gc^tgagatctcaagaacgatgacctttggc^gcaagggtgtgagccacgggcagttctttgaccagcacctcaagtttatcaagctg 

aaccagcagtttgtgcacttcacccagctggacctgtcttacctgcagcgggaggcctatgaccgagaWcctcg 

tgctccccagctgcaggtggagaaagtgaggaccaatgaccggaaggagctgggggaggtgcgggtgcagtatacgggcaggga 

cagcttcaaggcmcgccaaggctctgggtgtcatggatgaccttaagtcgggggtt(^gagagctggctaccggggtattgte^ 

ccagttcccgggccgccgtgtccacctggcgcccccaccgacgtgggagggctatgatcctagctggaattag 



Figure 8 
GnTI Cys121Thr

avipnviacdretvrrtldkllh>opsaelf^ 
gqvfrqfrfpaavweddlevapdffeyfotypllkadpslw^^ 
epkwpkafwddwmrrpeqrqgracirpeisrtmtf^ 
apqlqvekvrtndikelgevrvqytgrdsfkafakalgvmddlksgvpragyrgivt^ 

Gcggtgattcccatcctggtcatcgcctgtgaccgcagcactgttcggcgcactctagacaagctgctgcattatcggccctcggctg 
agctcttccwatcatcgttagccaggactgcgggcacgaggagacggcccaggccatcgcctcctacggcagcgcggtcacgcac 



gctgggcgctgggccaggtcttccggcagtttcgcttccccgcggccgtggtggtggaggatgacctggaggtggcc^ 

tcgagtactttcgggccacctatccgctgctgaaggccgacccctccctgtggtgcgtctcggcctggaatga^ 

agatggtggacgccagcaggcctgagctgctctaccgcaccgactttttccctggcctgggct^ 



aaccagcagmgtgcacttcacccagctggacctgtcttacctgcagcgggaggcctatgaccgagamcctcgcc^ 
tgctccccagctgcaggtggagaaagtgaggaccaatgaccggaaggagctgggggaggtgcgggtgcagtatacgggcaggga 
cagcttcaaggctttcgccaaggctctgggtgtcatggatgaccttaagtcgggggttccgagagctggctaccggggtattgt^ 
ccagttcccgggccgccgtgtccacctggcgcccccaccgacgtgggagggctatgatcctagctggaattag 



Figure 9 
GnTI Cys121Ala



avipilviacdrstviraldkllhyipsaelipiivsqdc^ 
lgqvfrqfirfpaavweddlevapdffe;^^ 
lepkwpkarwddwmrrpeqrqgrac^ 
apqlqvekvrtndrkelgevivqytgKlsfk^^ 



Gcggtgattcccatcctggtcatcgcctgtgaccgcagcactgttcggcgcgccctagacaagctgctgcattatcggccctcggctg 
agctcttccccatcatcgttagccaggactgcgggcacgaggagacggcccaggccatcgcctcctacggcagcgcggtcacgcac 
atccggcagcccgacxtgagcagcattgcggtgccgccggarc 

gctgggcgctgggccaggtcttccggcagtttcgcttccccgcggccgtggtggtggaggatgacctggaggtggccccggacttct 
tcgagtactttcgggccacctatccgctgctgaaggc^^ 

agatggtggacgccagcaggcctgagctgctctaccgcaccgactttttccctggcctgggctggctgctgtt 
tgagctggagcccaagtggccaaaggccttctgggacgactggatgcggcggccggagcagcggcaggggcgggcctgcatac 
g(^tgagatctcaagaacgatgaccmgga;gcaagggtgtgagc^cgggcagttctttgaccagcacctcaagtttatcaagctg 
aaccagcagtttgfgcacttcaccragctggacctgtctt^ 

tgctccccagctgcaggtggagaaagtgaggaccaatgaccggaaggagctgggggaggtgcgggtgcagtatacgggcaggga 
<^gcttcaaggctttcgccaaggctctgggtgtcatggatgaccttaagtcgggggttccgagagctggcta(xggggtattgtcacctt 
ccag^cccgggccgccgtgtccacctggcgcccccaccgacgtgggagggctatgatcctagctggaattag 



Figure 10 
GnTI Arg120Ala, Cys121His



avipilviacdretvrahldldlhyrpsaelfpito^ 
algqvfrqfrfpaavvveddlevapdfFeyfratypllkadpslwcvsawnc^ 
elepkwpkafwddwrrmpeqrqgrac^ 
gapqlqvekvrtndrkelgevrvqytgrdsfkaf^ 



Gcggtgattcccatcctggtcatcgcctgtgaccgcagcactgttcgggcccacctagacaagctgctgcattatcggccctcggctg 
agctcttccccatcatcgttagccaggactgcgggcacgaggagacggcccaggccatcgcctcctacggcagcgcggtcacgcac 
atccggcagcccgacctgagcagcattgcggtgccgccggaccaccgcaagttccagggctactacaagatcgcgcgccactacc 
gctgggcgctgggccaggtcttccggcagfttcgcttcc^ 

tcgagtactttcgggc^acctatccgctgctgaaggccgacccctccctg^ggtgcgtctcggcctggaatga 

agatggtggacgccagcaggcctgagctgctctaccgcaccgacttmccctggcctgggctggctgctgttggccgagctctgg^^ 

tgagctggagcccaagtggccaaaggccttctgggacgactggatgcggcggccggagcagcggcaggggcgggcctgcatac 

gccctgagatctcaagaacgatgacctttggccgcaagggtgtgagccacgggcagttctttgaccagcacctcaagtttatcaagctg 

aaccagcagtttgtgcacttcacccagctggacctgfc^ 

tgctccccagctgcaggtggagaaagtgaggaccaatgaccggaaggagctgggggaggtgcgggtgcagtatacgggcaggga 

cagcttcaaggctttcgccaaggctctgggtgtcatggatgaccttaagtcgggggttccgagagctggctaccggggtatt^ 

ccagttcccgggccgccgtgtccacctggcgcccccaccgacgtgggagggctatgatcctagctggaattag 



Figure 11 
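
The substitutions named in Figures 7 through 11 can be read directly off the nucleotide listings. The sketch below translates the fifteen-nucleotide window that is legible in each listing (it begins `cgg...` and ends `...ctagac`) using the standard genetic code. The assumption that this window spans residues 119 to 123 is inferred from the mutant names, not stated in the figures, and the helper names are illustrative.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Windows copied from the nucleotide listings of Figures 7 to 11.
# Assumed numbering: the five codons correspond to residues 119 to 123.
WINDOWS = {
    "Cys121Ser (Figure 7)": "cggcgctctctagac",
    "Cys121Asp (Figure 8)": "cggcgcgatctagac",
    "Cys121Thr (Figure 9)": "cggcgcactctagac",
    "Cys121Ala (Figure 10)": "cggcgcgccctagac",
    "Arg120Ala, Cys121His (Figure 11)": "cgggcccacctagac",
}

for name, window in WINDOWS.items():
    print(name, translate(window))
```

Under the assumed numbering, the third letter of each translated window is the residue at position 121, and the second letter is the residue at position 120.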
Rat Liver ST3Gal III amino acid sequence:

GFLIXLDSKLPAELATKYANFSEGACKPGYASAMMTAIFPRFSKPAPMFLDDSFRKW 

ARIREFWPFGIKGQDNLIKAII^VTKEYRLTPAIJDSIJICRRCIIVGNGGVLANKSLGS 

RTODYDrVTRLNSAP^GFEKDVGSKTTLRITYPEGAMQRPEQYERDSIJVLAGFKW 

QDFKWLKYTVYKERVSASDGFWKSVATRVPKEPPEIRILNPYFIQEAAFTLIGLPFNN 

GIJvlGRGNffTLGSVAVTMALDGCDEVAVAGFGYDMNTPNAPLHYYETVRMAAIKE 

SWTHNIQREKEFLRKLVKARVITDLSSGI 



Figure 12 
Full length UDP-N-acetylgalactosaminyltransferase 2 (GalNAcT2) nucleic acid and
amino acid sequences



Amino acid sequence 

Met Arg Arg Arg Ser Arg Met Leu Leu Cys Phe Ala Phe Leu Trp Val
1 5 10 15

Leu Gly Ile Ala Tyr Tyr Met Tyr Ser Gly Gly Gly Ser Ala Leu Ala
20 25 30

Gly Gly Ala Gly Gly Gly Ala Gly Arg Lys Glu Asp Trp Asn Glu Ile
35 40 45

Asp Pro Ile Lys Lys Lys Asp Leu His His Ser Asn Gly Glu Glu Lys
50 55 60

Ala Gln Ser Met Glu Thr Leu Pro Pro Gly Lys Val Arg Trp Pro Asp
65 70 75 80

Phe Asn Gln Glu Ala Tyr Val Gly Gly Thr Met Val Arg Ser Gly Gln
85 90 95

Asp Pro Tyr Ala Arg Asn Lys Phe Asn Gln Val Glu Ser Asp Lys Leu
100 105 110

Arg Met Asp Arg Ala Ile Pro Asp Thr Arg His Asp Gln Cys Gln Arg
115 120 125

Lys Gln Trp Arg Val Asp Leu Pro Ala Thr Ser Val Val Ile Thr Phe
130 135 140

His Asn Glu Ala Arg Ser Ala Leu Leu Arg Thr Val Val Ser Val Leu
145 150 155 160

Lys Lys Ser Pro Pro His Leu Ile Lys Glu Ile Ile Leu Val Asp Asp
165 170 175

Tyr Ser Asn Asp Pro Glu Asp Gly Ala Leu Leu Gly Lys Ile Glu Lys
180 185 190

Val Arg Val Leu Arg Asn Asp Arg Arg Glu Gly Leu Met Arg Ser Arg
195 200 205

Val Arg Gly Ala Asp Ala Ala Gln Ala Lys Val Leu Thr Phe Leu Asp
210 215 220

Ser His Cys Glu Cys Asn Glu His Trp Leu Glu Pro Leu Leu Glu Arg
225 230 235 240

Val Ala Glu Asp Arg Thr Arg Val Val Ser Pro Ile Ile Asp Val Ile
245 250 255

Asn Met Asp Asn Phe Gln Tyr Val Gly Ala Ser Ala Asp Leu Lys Gly
260 265 270

Gly Phe Asp Trp Asn Leu Val Phe Lys Trp Asp Tyr Met Thr Pro Glu
275 280 285 
Gln Arg Arg Ser Arg Gln Gly Asn Pro Val Ala Pro Ile Lys Thr Pro
290 295 300

Met Ile Ala Gly Gly Leu Phe Val Met Asp Lys Phe Tyr Phe Glu Glu
305 310 315 320

Leu Gly Lys Tyr Asp Met Met Met Asp Val Trp Gly Gly Glu Asn Leu
325 330 335

Glu Ile Ser Phe Arg Val Trp Gln Cys Gly Gly Ser Leu Glu Ile Ile
340 345 350

Pro Cys Ser Arg Val Gly His Val Phe Arg Lys Gln His Pro Tyr Thr
355 360 365

Phe Pro Gly Gly Ser Gly Thr Val Phe Ala Arg Asn Thr Arg Arg Ala
370 375 380

Ala Glu Val Trp Met Asp Glu Tyr Lys Asn Phe Tyr Tyr Ala Ala Val
385 390 395 400

Pro Ser Ala Arg Asn Val Pro Tyr Gly Asn Ile Gln Ser Arg Leu Glu
405 410 415

Leu Arg Lys Lys Leu Ser Cys Lys Pro Phe Lys Trp Tyr Leu Glu Asn
420 425 430

Val Tyr Pro Glu Leu Arg Val Pro Asp His Gln Asp Ile Ala Phe Gly
435 440 445

Ala Leu Gln Gln Gly Thr Asn Cys Leu Asp Thr Leu Gly His Phe Ala
450 455 460

Asp Gly Val Val Gly Val Tyr Glu Cys His Asn Ala Gly Gly Asn Gln
465 470 475 480

Glu Trp Ala Leu Thr Lys Glu Lys Ser Val Lys His Met Asp Leu Cys
485 490 495

Leu Thr Val Val Asp Arg Ala Pro Gly Ser Leu Ile Lys Leu Gln Gly
500 505 510

Cys Arg Glu Asn Asp Ser Arg Gln Lys Trp Glu Gln Ile Glu Gly Asn
515 520 525

Ser Lys Leu Arg His Val Gly Ser Asn Leu Cys Leu Asp Ser Arg Thr
530 535 540

Ala Lys Ser Gly Gly Leu Ser Val Glu Val Cys Gly Pro Ala Leu Ser
545 550 555 560

Gln Gln Trp Lys Phe Thr Leu Asn Leu Gln Gln
565 570
Nucleic acid sequence

atgcggcggc gctcgcggat gctgctctgc ttcgccttcc tgtgggtgct gggcatcgcc 60 

tactacatgt actcgggggg cggctctgcg ctggccgggg gcgcgggcgg cggcgccggc 120 

aggaaggagg actggaatga aattgacccc attaaaaaga aagaccttca tcacagcaat 180 

ggagaagaga aagcacaaag catggagacc ctccctccag ggaaagtacg gtggccagac 240 

tttaaccagg aagcttatgt tggagggacg atggtccgct ccgggcagga cccttacgcc 300 

cgcaacaagt tcaaccaggt ggagagtgat aagcttcgaa tggacagagc catccctgac 360 

acccggcatg accagtgtca gcggaagcag tggcgggtgg atctgccggc caccagcgtg 420 

gtgatcacgt ttcacaatga agccaggtcg gccctactca ggaccgtggt cagcgtgctt 480 

aagaaaagcc cgccccatct cataaaagaa atcatcttgg tggatgacta cagcaatgat 540 

cctgaggacg gggctctctt ggggaaaatt gagaaagtgc gagttcttag aaatgatcga 600 

cgagaaggcc tcatgcgctc acgggttcgg ggggccgatg ctgcccaagc caaggtcctg 660 

accttcctgg acagtcactg cgagtgtaat gagcactggc tggagcccct cctggaaagg 720 

gtggcggagg acaggactcg ggttgtgtca cccatcatcg atgtcattaa tatggacaac 780 

tttcagtatg tgggggcatc tgctgacttg aagggcggtt ttgattggaa cttggtattc 840 

aagtgggatt acatgacgcc tgagcagaga aggtcccggc aggggaaccc agtcgcccct 900 

ataaaaaccc ccatgattgc tggtgggctg tttgtgatgg ataagttcta ttttgaagaa 960 

ctggggaagt acgacatgat gatggatgtg tggggaggag agaacctaga gatctcgttc 1020 

cgcgtgtggc agtgtggtgg cagcctggag atcatcccgt gcagccgtgt gggacacgtg 1080 

ttccggaagc agcaccccta cacgttcccg ggtggcagtg gcactgtctt tgcccgaaac 1140 

acccgccggg cagcagaggt ctggatggat gaatacaaaa atttctatta tgcagcagtg 1200 

ccttctgcta gaaacgttcc ttatggaaat attcagagca gattggagct taggaagaaa 1260 

ctcagctgca agcctttcaa atggtacctt gaaaatgtct atccagagtt aagggttcca 1320 

gaccatcagg atatagcttt tggggccttg cagcagggaa ctaactgcct cgacactttg 1380 

ggacactttg ctgatggtgt ggttggagtt tatgaatgtc acaatgctgg gggaaaccag 1440 

gaatgggcct tgacgaagga gaagtcggtg aagcacatgg atttgtgcct tactgtggtg 1500 

gaccgggcac cgggctctct tataaagctg cagggctgcc gagaaaatga cagcagacag 1560 

aaatgggaac agatcgaggg caactccaag ctgaggcacg tgggcagcaa cctgtgcctg 1620 

gacagtcgca cggccaagag cgggggccta agcgtggagg tgtgtggccc ggccctttcg 1680 

cagcagtgga agttcacgct caacctgcag cag 1713 
Figure 13B 
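
The nucleic acid and amino acid listings for full length GalNAcT2 can be cross-checked against each other. The sketch below is a minimal illustration using the standard genetic code: it translates the first 60 nucleotides of the listing above and confirms that the stated lengths of the two GalNAcT2 nucleic acid listings (1713 and 1560 nucleotides) correspond to 571 and 520 codons. The helper names are illustrative assumptions, and only the first row of the sequence is checked.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1, skipping spaces and any partial codon."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First row (nucleotides 1 to 60) of the full length nucleic acid listing.
FIRST_ROW = "atgcggcggc gctcgcggat gctgctctgc ttcgccttcc tgtgggtgct gggcatcgcc"

# One-letter form of the first 20 residues of the amino acid listing:
# Met Arg Arg Arg Ser Arg Met Leu Leu Cys Phe Ala Phe Leu Trp Val Leu Gly Ile Ala
EXPECTED_FIRST_20 = "MRRRSRMLLCFAFLWVLGIA"

print(translate(FIRST_ROW) == EXPECTED_FIRST_20)

# Lengths stated in the listings: 1713 nt (full length) and 1560 nt (truncated).
print(1713 // 3, 1560 // 3, 1713 // 3 - 1560 // 3)
```

The difference of 51 codons between the two lengths matches the 51-residue N-terminal truncation of the shorter construct.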
Δ51 UDP-N-acetylgalactosaminyltransferase 2 (GalNAcT2) nucleic acid and
amino acid sequences

Amino acid sequence 

Lys Lys Lys Asp Leu His His Ser Asn Gly Glu Glu Lys Ala Gln Ser
1 5 10 15

Met Glu Thr Leu Pro Pro Gly Lys Val Arg Trp Pro Asp Phe Asn Gln
20 25 30

Glu Ala Tyr Val Gly Gly Thr Met Val Arg Ser Gly Gln Asp Pro Tyr
35 40 45

Ala Arg Asn Lys Phe Asn Gln Val Glu Ser Asp Lys Leu Arg Met Asp
50 55 60

Arg Ala Ile Pro Asp Thr Arg His Asp Gln Cys Gln Arg Lys Gln Trp
65 70 75 80

Arg Val Asp Leu Pro Ala Thr Ser Val Val Ile Thr Phe His Asn Glu
85 90 95

Ala Arg Ser Ala Leu Leu Arg Thr Val Val Ser Val Leu Lys Lys Ser
100 105 110

Pro Pro His Leu Ile Lys Glu Ile Ile Leu Val Asp Asp Tyr Ser Asn
115 120 125

Asp Pro Glu Asp Gly Ala Leu Leu Gly Lys Ile Glu Lys Val Arg Val
130 135 140

Leu Arg Asn Asp Arg Arg Glu Gly Leu Met Arg Ser Arg Val Arg Gly
145 150 155 160

Ala Asp Ala Ala Gln Ala Lys Val Leu Thr Phe Leu Asp Ser His Cys
165 170 175

Glu Cys Asn Glu His Trp Leu Glu Pro Leu Leu Glu Arg Val Ala Glu
180 185 190

Asp Arg Thr Arg Val Val Ser Pro Ile Ile Asp Val Ile Asn Met Asp
195 200 205

Asn Phe Gln Tyr Val Gly Ala Ser Ala Asp Leu Lys Gly Gly Phe Asp
210 215 220

Trp Asn Leu Val Phe Lys Trp Asp Tyr Met Thr Pro Glu Gln Arg Arg
225 230 235 240

Ser Arg Gln Gly Asn Pro Val Ala Pro Ile Lys Thr Pro Met Ile Ala
245 250 255

Gly Gly Leu Phe Val Met Asp Lys Phe Tyr Phe Glu Glu Leu Gly Lys
260 265 270

Tyr Asp Met Met Met Asp Val Trp Gly Gly Glu Asn Leu Glu Ile Ser
275 280 285

Phe Arg Val Trp Gln Cys Gly Gly Ser Leu Glu Ile Ile Pro Cys Ser
290 295 300

Arg Val Gly His Val Phe Arg Lys Gln His Pro Tyr Thr Phe Pro Gly
305 310 315 320

Gly Ser Gly Thr Val Phe Ala Arg Asn Thr Arg Arg Ala Ala Glu Val
325 330 335

Trp Met Asp Glu Tyr Lys Asn Phe Tyr Tyr Ala Ala Val Pro Ser Ala
340 345 350

Arg Asn Val Pro Tyr Gly Asn Ile Gln Ser Arg Leu Glu Leu Arg Lys
355 360 365

Lys Leu Ser Cys Lys Pro Phe Lys Trp Tyr Leu Glu Asn Val Tyr Pro
370 375 380

Glu Leu Arg Val Pro Asp His Gln Asp Ile Ala Phe Gly Ala Leu Gln
385 390 395 400

Gln Gly Thr Asn Cys Leu Asp Thr Leu Gly His Phe Ala Asp Gly Val
405 410 415

Val Gly Val Tyr Glu Cys His Asn Ala Gly Gly Asn Gln Glu Trp Ala
420 425 430

Leu Thr Lys Glu Lys Ser Val Lys His Met Asp Leu Cys Leu Thr Val
435 440 445

Val Asp Arg Ala Pro Gly Ser Leu Ile Lys Leu Gln Gly Cys Arg Glu
450 455 460

Asn Asp Ser Arg Gln Lys Trp Glu Gln Ile Glu Gly Asn Ser Lys Leu
465 470 475 480

Arg His Val Gly Ser Asn Leu Cys Leu Asp Ser Arg Thr Ala Lys Ser
485 490 495

Gly Gly Leu Ser Val Glu Val Cys Gly Pro Ala Leu Ser Gln Gln Trp
500 505 510

Lys Phe Thr Leu Asn Leu Gln Gln
515 520
Nucleic acid sequence

aaaaagaaag accttcatca cagcaatgga gaagagaaag cacaaagcat ggagaccctc 60
cctccaggga aagtacggtg gccagacttt aaccaggaag cttatgttgg agggacgatg 120
gtccgctccg ggcaggaccc ttacgcccgc aacaagttca accaggtgga gagtgataag 180
cttcgaatgg acagagccat ccctgacacc cggcatgacc agtgtcagcg gaagcagtgg 240
cgggtggatc tgccggccac cagcgtggtg atcacgtttc acaatgaagc caggtcggcc 300
ctactcagga ccgtggtcag cgtgcttaag aaaagcccgc cccatctcat aaaagaaatc 360
atcttggtgg atgactacag caatgatcct gaggacgggg ctctcttggg gaaaattgag 420
aaagtgcgag ttcttagaaa tgatcgacga gaaggcctca tgcgctcacg ggttcggggg 480
gccgatgctg cccaagccaa ggtcctgacc ttcctggaca gtcactgcga gtgtaatgag 540
cactggctgg agcccctcct ggaaagggtg gcggaggaca ggactcgggt tgtgtcaccc 600
atcatcgatg tcattaatat ggacaacttt cagtatgtgg gggcatctgc tgacttgaag 660
ggcggttttg attggaactt ggtattcaag tgggattaca tgacgcctga gcagagaagg 720
tcccggcagg ggaacccagt cgcccctata aaaaccccca tgattgctgg tgggctgttt 780
gtgatggata agttctattt tgaagaactg gggaagtacg acatgatgat ggatgtgtgg 840
ggaggagaga acctagagat ctcgttccgc gtgtggcagt gtggtggcag cctggagatc 900
atcccgtgca gccgtgtggg acacgtgttc cggaagcagc acccctacac gttcccgggt 960
ggcagtggca ctgtctttgc ccgaaacacc cgccgggcag cagaggtctg gatggatgaa 1020
tacaaaaatt tctattatgc agcagtgcct tctgctagaa acgttcctta tggaaatatt 1080
cagagcagat tggagcttag gaagaaactc agctgcaagc ctttcaaatg gtaccttgaa 1140
aatgtctatc cagagttaag ggttccagac catcaggata tagcttttgg ggccttgcag 1200
cagggaacta actgcctcga cactttggga cactttgctg atggtgtggt tggagtttat 1260
gaatgtcaca atgctggggg aaaccaggaa tgggccttga cgaaggagaa gtcggtgaag 1320
cacatggatt tgtgccttac tgtggtggac cgggcaccgg gctctcttat aaagctgcag 1380
ggctgccgag aaaatgacag cagacagaaa tgggaacaga tcgagggcaa ctccaagctg 1440
aggcacgtgg gcagcaacct gtgcctggac agtcgcacgg ccaagagcgg gggcctaagc 1500
gtggaggtgt gtggcccggc cctttcgcag cagtggaagt tcacgctcaa cctgcagcag 1560
Street of Mailing Address:: 14 Fern Avenue
City of Mailing Address:: Willow Grove
State or Province of mailing address:: PA
Country of mailing address:: US
Postal or Zip Code of mailing address:: 19090



Applicant Authority Type:: Inventor
Primary Citizenship Country:: US
Status:: Full Capacity
Given Name:: Scott
Middle Name::
Family Name:: Willett
Name Suffix::
City of Residence:: Doylestown
State or Province of Residence:: PA
Country of Residence:: US
Street of Mailing Address:: 3820 Comley Circle
City of Mailing Address:: Doylestown
State or Province of mailing address:: PA
Country of mailing address:: US
Postal or Zip Code of mailing address:: 19801



Applicant Authority Type:: Inventor
Primary Citizenship Country:: US
Status:: Full Capacity
Given Name:: Karl
Middle Name:: F.
Family Name:: Johnson
Name Suffix::
City of Residence:: Hatboro
State or Province of Residence:: PA
Country of Residence:: US
Street of Mailing Address:: 5320 Ivystream Road
City of Mailing Address:: Hatboro
State or Province of mailing address:: PA
Country of mailing address:: US
Postal or Zip Code of mailing address:: 19040



Applicant Authority Type:: Inventor
Primary Citizenship Country:: US
Status:: Full Capacity
Given Name:: Daniel
Middle Name:: James
Family Name:: Bezila
Name Suffix::
City of Residence:: Philadelphia
State or Province of Residence:: PA
Country of Residence:: US
Street of Mailing Address:: 715 Red Lion Road, 2nd Floor
City of Mailing Address:: Philadelphia
State or Province of mailing address:: PA
Country of mailing address:: US
Postal or Zip Code of mailing address:: 19115



Applicant Authority Type:: Inventor
Primary Citizenship Country:: US
Status:: Full Capacity
Given Name:: Shawn
Middle Name::
Family Name::
Name Suffix::
City of Residence:: North Wales
State or Province of Residence:: PA
Country of Residence:: US
Street of Mailing Address:: 126 Filly Drive
City of Mailing Address:: North Wales
State or Province of mailing address:: PA
Country of mailing address:: US
Postal or Zip Code of mailing address:: 19454



Correspondence Information 

Correspondence Customer Number:: 20350 

Representative Information 

Representative Customer Number:: 20350 

Domestic Priority Information 

Application:: Continuity Type:: Parent Application:: Parent Filing Date:: 



Foreign Priority Information 

Country:: Application number:: Filing Date:: 

Assignee Information 

Assignee Name::
Street of mailing address::
City of mailing address::
State or Province of mailing address::
Country of mailing address::
Postal or Zip Code of mailing address::



